From 22488147c3c0cb9b4148b7d2565c88bafff2f93e Mon Sep 17 00:00:00 2001 From: "Andrei G." Date: Wed, 29 Apr 2026 13:34:47 +0200 Subject: [PATCH 1/3] chore: add Copilot custom instructions for code review Repository-wide instructions (.github/copilot-instructions.md) cover architecture overview, lock scope rules, glob anchoring, channel semantics, and deny.toml license gaps. Path-specific instruction files cover: - bridge/lsp: TOCTOU, version overflow, notification_pump deadlock, file watcher - config/deps: license allow list, serde round-trip tests, TOML defaults - mcp: 1-based positions, JsonSchema derive, tool dispatch completeness - tests: #[ignore] hygiene, resync coverage, glob pattern correctness - ci: feature flag parity, RUSTDOCFLAGS, MSRV guards - CHANGELOG: breaking change format, PR references Instructions are derived from findings in PR #101 and #103 reviews. --- .github/copilot-instructions.md | 58 +++++++++++++++++++ .../instructions/changelog.instructions.md | 14 +++++ .github/instructions/ci.instructions.md | 30 ++++++++++ .../instructions/mcp-tools.instructions.md | 25 ++++++++ .../instructions/rust-bridge.instructions.md | 55 ++++++++++++++++++ .../instructions/rust-config.instructions.md | 22 +++++++ .github/instructions/tests.instructions.md | 33 +++++++++++ 7 files changed, 237 insertions(+) create mode 100644 .github/copilot-instructions.md create mode 100644 .github/instructions/changelog.instructions.md create mode 100644 .github/instructions/ci.instructions.md create mode 100644 .github/instructions/mcp-tools.instructions.md create mode 100644 .github/instructions/rust-bridge.instructions.md create mode 100644 .github/instructions/rust-config.instructions.md create mode 100644 .github/instructions/tests.instructions.md diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 0000000..93909d2 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,58 @@ +# Copilot Instructions for mcpls + +mcpls is a bridge between MCP (Model Context Protocol) and LSP (Language Server Protocol). +AI clients speak MCP to mcpls; mcpls spawns and communicates with language servers over LSP. + +## Architecture + +``` +AI Client ←→ [MCP/stdio] ←→ mcpls ←→ [LSP/stdio] ←→ rust-analyzer / tsgo / pyright / ... +``` + +Key crates: +- `mcpls-core/src/bridge/` — MCP→LSP translation, document state, diagnostics cache +- `mcpls-core/src/lsp/` — LSP client, JSON-RPC 2.0 lifecycle, file watcher +- `mcpls-core/src/mcp/` — MCP server, tool definitions and dispatch +- `mcpls-core/src/config/` — TOML config, LSP server discovery heuristics + +## Code Review Priorities + +### Correctness + +- **Position encoding**: MCP is 1-based, LSP is 0-based. All conversions must go through + `bridge/encoding.rs`. Flag any direct arithmetic on line/column values outside that module. + +- **LSP protocol compliance**: Inbound JSON-RPC messages fall into four shapes: + `method`+`id` → server request, `method` only → notification, + `id`+(`result`|`error`) → response, anything else → protocol error. + Responses without `result` or `error` must not be silently treated as `null`. + +- **Document versioning**: `textDocument/didOpen` version numbers must be strictly + monotone per document URI. After `didClose`/`didOpen` resync cycles, reset to 1 + rather than saturating at `i32::MAX`. + +- **Glob patterns**: `globset` matches against the full absolute path. Bare patterns + like `*.rs` only match filenames with no directory component. Patterns without a + leading `/` or `**/` prefix must be anchored with `**/` before passing to `GlobSetBuilder`. + +### Concurrency + +- **Lock scope**: `Arc>` must never be held across async LSP I/O + (notify calls, request round-trips). If a lock guard is held while awaiting a network + operation, flag it — this creates head-of-line blocking between request handlers and + the `notification_pump`. + +- **Channel semantics**: `std::sync::mpsc::SyncSender::send` blocks when the channel is + full; it does not drop. Use `try_send` when the intent is drop-on-full. + +### Dependencies + +- Check `deny.toml` allow list when new crates are introduced. Licenses not in the list + will fail `cargo deny check licenses` in CI. Common gap: `CC0-1.0` (used by `notify`). + +## Style + +- All `pub` types, traits, functions, and methods must have `///` doc comments explaining + *what* and *why*, not just restating the name. +- No `unsafe` code — `deny(unsafe_code)` is enforced workspace-wide. +- `#[non_exhaustive]` is required on any public enum that may gain variants in future releases. diff --git a/.github/instructions/changelog.instructions.md b/.github/instructions/changelog.instructions.md new file mode 100644 index 0000000..458b1b8 --- /dev/null +++ b/.github/instructions/changelog.instructions.md @@ -0,0 +1,14 @@ +--- +applyTo: "CHANGELOG.md" +--- + +## CHANGELOG review checklist + +- Breaking changes to public APIs (new enum variants, removed fields, changed + signatures) must appear under `### Changed` with an explicit "Breaking change:" + prefix. A `### Fixed` entry alone is not sufficient. +- New `#[non_exhaustive]` enums or structs must be mentioned: downstream crates + that match exhaustively need to add a wildcard arm. +- Every entry must reference the PR or issue number in parentheses: `(#NNN)`. +- Entries go under `## [Unreleased]` until a release PR assigns a version. + Do not add entries directly under a versioned section. diff --git a/.github/instructions/ci.instructions.md b/.github/instructions/ci.instructions.md new file mode 100644 index 0000000..b4fb459 --- /dev/null +++ b/.github/instructions/ci.instructions.md @@ -0,0 +1,30 @@ +--- +applyTo: ".github/workflows/**,deny.toml,.cargo/**" +--- + +## CI and toolchain review checklist + +### Workflows + +- Feature flag sets in CI jobs must match the local pre-commit commands in `CLAUDE.md` + exactly: `--all-features` for clippy and nextest, `--no-deps --all-features` for + rustdoc. Divergence causes "passes locally, fails CI" issues. +- `RUSTDOCFLAGS="-D warnings"` must be set in the rustdoc job. Without it, broken + intra-doc links and missing doc comments pass silently. +- Nightly toolchain is required only for `cargo +nightly fmt`. All other jobs must use + stable. Pinning nightly globally causes unnecessary breakage on nightly regressions. + +### deny.toml + +- New dependencies require a license check before merge. Add the license to the + `allow` list in the same PR that introduces the crate. Do not merge a PR that + introduces a new crate without verifying `cargo deny check licenses` passes. +- Known unlisted licenses that will fail CI: `CC0-1.0` (used by `notify`). +- `cargo deny check advisories` must be clean. A new RUSTSEC advisory in a transitive + dependency is a blocker even if the vulnerable code path is not exercised. + +### MSRV + +- `rust-version` in `Cargo.toml` is the enforced minimum. Any API stabilised after + that version requires either a version bump (document in CHANGELOG) or a conditional + compile guard. diff --git a/.github/instructions/mcp-tools.instructions.md b/.github/instructions/mcp-tools.instructions.md new file mode 100644 index 0000000..c122d6a --- /dev/null +++ b/.github/instructions/mcp-tools.instructions.md @@ -0,0 +1,25 @@ +--- +applyTo: "crates/mcpls-core/src/mcp/**" +--- + +## MCP layer review checklist + +### Tool parameter structs (`tools.rs`) + +- Every position parameter (`line`, `character`) must be documented as 1-based in both + the `///` doc comment and the `#[schemars(description)]` annotation. LSP is 0-based; + the conversion happens in `bridge/encoding.rs` — not here. +- `file_path` parameters must accept absolute paths only. Doc comments must state this + explicitly so AI clients don't pass relative paths. +- New tool structs require `#[derive(JsonSchema)]` — omitting it silently removes the + tool from the MCP schema exposed to clients. + +### Tool dispatch (`handlers.rs`, `server.rs`) + +- Every tool handler must call `ensure_open` (via `Translator`) before issuing any LSP + request. Skipping it produces stale results when the file has changed on disk since + the last access. +- Tool errors must propagate as MCP error responses, not panics. `unwrap()` and + `expect()` are not acceptable in handler code. +- New tools must be registered in both the tool list and the dispatch match arm. + A tool missing from either will either never appear to clients or panic at runtime. diff --git a/.github/instructions/rust-bridge.instructions.md b/.github/instructions/rust-bridge.instructions.md new file mode 100644 index 0000000..aa319b5 --- /dev/null +++ b/.github/instructions/rust-bridge.instructions.md @@ -0,0 +1,55 @@ +--- +applyTo: "crates/mcpls-core/src/bridge/**,crates/mcpls-core/src/lsp/**,crates/mcpls-core/src/lib.rs" +--- + +## Bridge and LSP layer review checklist + +### Document state (`bridge/state.rs`) + +- `ensure_open` stats the file on every call and compares `(mtime, size)` against the + cached `SyncSignature`. On mismatch it must send `didClose` then `didOpen` with a + bumped version before returning. Verify this resync path is exercised by a test. +- Stat must happen on the same `File` handle as the subsequent read, or be re-done after + the read, to avoid TOCTOU: an atomic `rename(2)` between stat and `read_to_string` on + a coarse-mtime filesystem (ext4: 1 s, FAT32: 2 s) leaves the cached signature stale + permanently. +- Version counter must reset to 1 on overflow, not saturate. `saturating_add(1)` at + `i32::MAX` sends the same version on every subsequent resync; rust-analyzer silently + discards non-monotone `didOpen` notifications. + +### Diagnostics pipeline (`bridge/translator.rs`, `bridge/notifications.rs`) + +- `DiagnosticsMode::Cached` returns an empty `Vec` for both "no errors" and "server + has not yet analysed this file". AI clients cannot distinguish these. Flag if callers + treat empty as "clean". +- `merge_diagnostics` dedup key is `(range, severity, code)` when `code` is present, + falling back to a path-qualifier-stripped message otherwise. Doc comments that say + `(range, message, code)` are wrong — flag them. +- `pull_diagnostics` has a 30 s timeout. In `Hybrid` mode it runs before the cache read. + The cache read is a synchronous hashmap lookup; it should not be blocked behind the + pull. Prefer `tokio::join!` for the two sources. + +### Notification pump (`lib.rs`) + +- `notification_pump` must hold only a `NotificationCache` lock, never the full + `Translator` lock. Holding `Mutex` across an LSP round-trip in a request + handler while the pump waits for the same lock causes head-of-line deadlock when the + notification channel fills. + +### File watcher (`lsp/file_watcher.rs`) + +- `GlobPattern::String` patterns without a leading `/` or `**/` must be anchored with + `**/` before passing to `globset`. Bare `*.rs` does not match `/repo/src/lib.rs`. +- `SyncSender::send` blocks when the channel is full. Use `try_send` with a warn log + when the documented intent is drop-on-full. +- `NEVER_FORWARD_COMPONENTS` filtering should happen before the channel send, not after, + to avoid burning channel capacity on `target/` churn during `cargo build`. +- `register()` overwrites duplicate registration IDs silently. LSP spec treats + re-registration as an error; at minimum log a warning. + +### LSP client (`lsp/client.rs`, `lsp/transport.rs`) + +- Server-to-client requests (`method` + `id`) must receive a JSON-RPC response. + Unhandled methods should return error code `-32601` (Method Not Found), not be dropped. +- `id`-only messages (no `method`, no `result`, no `error`) are protocol errors and must + not be treated as successful `null` responses. diff --git a/.github/instructions/rust-config.instructions.md b/.github/instructions/rust-config.instructions.md new file mode 100644 index 0000000..c794dc3 --- /dev/null +++ b/.github/instructions/rust-config.instructions.md @@ -0,0 +1,22 @@ +--- +applyTo: "crates/mcpls-core/src/config/**,deny.toml,Cargo.toml,crates/*/Cargo.toml" +--- + +## Config and dependency review checklist + +### deny.toml + +- When a new crate is introduced, verify its license appears in the `allow` list. + `cargo deny check licenses` runs in CI and will fail on first run if the license is + missing. Known gap: `CC0-1.0` (used by `notify`). +- Check `cargo deny check advisories` output for any new RUSTSEC advisories in added + dependencies. + +### Config (`config/mod.rs`) + +- New `#[serde(default)]` fields on public config structs must have a test covering + the omitted-key case (TOML round-trip, not just JSON). +- Enum config options must include serde round-trip tests for every variant, both JSON + and TOML, since `#[serde(rename_all)]` applies to both serializers independently. +- LSP server discovery heuristics (file-pattern matching) must be case-insensitive on + the extension comparison — APFS and NTFS are case-insensitive filesystems. diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md new file mode 100644 index 0000000..2ea28bc --- /dev/null +++ b/.github/instructions/tests.instructions.md @@ -0,0 +1,33 @@ +--- +applyTo: "crates/mcpls-core/tests/**,crates/mcpls-core/src/**/tests.rs,crates/mcpls-core/src/**/*_tests.rs" +--- + +## Test review checklist + +### Integration tests (`tests/integration/`) + +- Tests requiring a live LSP binary (`rust-analyzer`, `tsgo`, etc.) must be marked + `#[ignore = "Requires in PATH"]`. Unmarked tests that shell out to LSP + binaries will break CI on clean runners. +- `rust_analyzer_tests.rs` probes must verify the post-PR #103 diagnostic pipeline: + `get_diagnostics` should return errors for files with known type errors, and + `get_cached_diagnostics` should return the same set (non-empty) after a + `publishDiagnostics` notification has been received. + +### E2e tests (`tests/e2e/`) + +- Every e2e test spawns the mcpls binary. If the binary is not pre-built, the test + must be `#[ignore]`. Document the required build step in the ignore message. +- Position values in e2e assertions must use 1-based line/column (MCP convention). + Using 0-based values will produce off-by-one failures that are hard to diagnose. + +### Unit tests + +- `ensure_open` resync path (signature mismatch → `didClose` + `didOpen`) must be + covered by an async test using a temp file and a mock `LspClient`. The stat-only + test (`test_stat_signature_changes_when_file_grows`) does not cover this path. +- `FileWatcher::register` and `compute_changes` require unit tests. New glob + registrations must be verified to actually match the intended file paths using + absolute path strings — bare patterns like `*.rs` will not match. +- `DiagnosticsMode` serde tests must cover all three variants in both JSON and TOML + deserialization, including the omitted-key default. From b43466eb0b5e7a763ab90875bdce6b1b39cdcf60 Mon Sep 17 00:00:00 2001 From: "Andrei G." Date: Wed, 29 Apr 2026 13:41:21 +0200 Subject: [PATCH 2/3] chore: rewrite Copilot instructions in general form Remove project-specific file references, line numbers, and PR mentions. Instructions now describe general principles with examples only where needed to illustrate correct behaviour. --- .github/copilot-instructions.md | 71 +++++----- .../instructions/changelog.instructions.md | 24 ++-- .github/instructions/ci.instructions.md | 37 +++-- .../instructions/mcp-tools.instructions.md | 46 ++++--- .../instructions/rust-bridge.instructions.md | 127 +++++++++++------- .../instructions/rust-config.instructions.md | 36 +++-- .github/instructions/tests.instructions.md | 58 ++++---- 7 files changed, 219 insertions(+), 180 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 93909d2..65d52a2 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -3,56 +3,57 @@ mcpls is a bridge between MCP (Model Context Protocol) and LSP (Language Server Protocol). AI clients speak MCP to mcpls; mcpls spawns and communicates with language servers over LSP. -## Architecture - ``` -AI Client ←→ [MCP/stdio] ←→ mcpls ←→ [LSP/stdio] ←→ rust-analyzer / tsgo / pyright / ... +AI Client ←→ [MCP/stdio] ←→ mcpls ←→ [LSP/stdio] ←→ language server ``` -Key crates: -- `mcpls-core/src/bridge/` — MCP→LSP translation, document state, diagnostics cache -- `mcpls-core/src/lsp/` — LSP client, JSON-RPC 2.0 lifecycle, file watcher -- `mcpls-core/src/mcp/` — MCP server, tool definitions and dispatch -- `mcpls-core/src/config/` — TOML config, LSP server discovery heuristics +Key crates: `mcpls-core/src/bridge/`, `lsp/`, `mcp/`, `config/`. -## Code Review Priorities +## Correctness -### Correctness +**Position encoding** — MCP is 1-based, LSP is 0-based. All line/column conversions must +go through the dedicated encoding module. Direct arithmetic on position values outside +that module is a bug. -- **Position encoding**: MCP is 1-based, LSP is 0-based. All conversions must go through - `bridge/encoding.rs`. Flag any direct arithmetic on line/column values outside that module. +**LSP JSON-RPC message classification** — four shapes exist: +- `method` + `id` → server-to-client request (must be replied to) +- `method` only → notification (fire-and-forget) +- `id` + `result` or `error` → response +- anything else → protocol error -- **LSP protocol compliance**: Inbound JSON-RPC messages fall into four shapes: - `method`+`id` → server request, `method` only → notification, - `id`+(`result`|`error`) → response, anything else → protocol error. - Responses without `result` or `error` must not be silently treated as `null`. +A message with `id` but no `method`, `result`, or `error` is invalid and must not be +silently treated as a successful `null` response. -- **Document versioning**: `textDocument/didOpen` version numbers must be strictly - monotone per document URI. After `didClose`/`didOpen` resync cycles, reset to 1 - rather than saturating at `i32::MAX`. +**Document version monotonicity** — `textDocument/didOpen` version numbers must strictly +increase per URI across the document's lifetime. On counter overflow, reset to 1 rather +than saturating — after `didClose`/`didOpen` the server treats the document as fresh +regardless of the version number. -- **Glob patterns**: `globset` matches against the full absolute path. Bare patterns - like `*.rs` only match filenames with no directory component. Patterns without a - leading `/` or `**/` prefix must be anchored with `**/` before passing to `GlobSetBuilder`. +**Filesystem signature freshness** — when caching file state with `(mtime, size)`, stat +and read must be bound to the same moment. A stat before the read creates a TOCTOU +window: on filesystems with low mtime resolution a same-size rewrite within a single +clock tick leaves the signature unchanged while content has changed. -### Concurrency +## Concurrency -- **Lock scope**: `Arc>` must never be held across async LSP I/O - (notify calls, request round-trips). If a lock guard is held while awaiting a network - operation, flag it — this creates head-of-line blocking between request handlers and - the `notification_pump`. +**Lock granularity** — a mutex that guards a large struct must not be held across async +I/O. If a background task and a request handler share the same lock, and the request +handler awaits a network round-trip while holding it, the background task stalls. When +the background channel fills, the network response can no longer be delivered — +head-of-line deadlock. Prefer fine-grained locks scoped to the data they protect. -- **Channel semantics**: `std::sync::mpsc::SyncSender::send` blocks when the channel is - full; it does not drop. Use `try_send` when the intent is drop-on-full. +**Channel drop-on-full semantics** — `std::sync::mpsc::SyncSender::send` blocks when +the channel is full; it does not drop. Use `try_send` when the intent is to discard +events under backpressure and never block the sender thread. -### Dependencies +## Dependencies -- Check `deny.toml` allow list when new crates are introduced. Licenses not in the list - will fail `cargo deny check licenses` in CI. Common gap: `CC0-1.0` (used by `notify`). +When a new crate is introduced, verify its license is in the `cargo deny` allow list. +A missing license will fail `cargo deny check licenses` in CI on the first run. ## Style -- All `pub` types, traits, functions, and methods must have `///` doc comments explaining - *what* and *why*, not just restating the name. +- All `pub` items must have `///` doc comments explaining *what* and *why*. - No `unsafe` code — `deny(unsafe_code)` is enforced workspace-wide. -- `#[non_exhaustive]` is required on any public enum that may gain variants in future releases. +- Public enums that may gain variants must be marked `#[non_exhaustive]`, and breaking + additions must be documented under `### Changed` in CHANGELOG. diff --git a/.github/instructions/changelog.instructions.md b/.github/instructions/changelog.instructions.md index 458b1b8..42d7133 100644 --- a/.github/instructions/changelog.instructions.md +++ b/.github/instructions/changelog.instructions.md @@ -2,13 +2,17 @@ applyTo: "CHANGELOG.md" --- -## CHANGELOG review checklist - -- Breaking changes to public APIs (new enum variants, removed fields, changed - signatures) must appear under `### Changed` with an explicit "Breaking change:" - prefix. A `### Fixed` entry alone is not sufficient. -- New `#[non_exhaustive]` enums or structs must be mentioned: downstream crates - that match exhaustively need to add a wildcard arm. -- Every entry must reference the PR or issue number in parentheses: `(#NNN)`. -- Entries go under `## [Unreleased]` until a release PR assigns a version. - Do not add entries directly under a versioned section. +## Changelog format + +Breaking changes to public APIs must appear under `### Changed` with an explicit +"Breaking change:" prefix. A `### Fixed` entry alone is not sufficient — downstream +users scanning the changelog for breakage will miss it. + +Additions of `#[non_exhaustive]` to existing public enums or structs are breaking +changes for crates that match exhaustively. Document the required migration (add a +wildcard arm, update struct initialisation) in the entry. + +Every entry must reference the PR or issue number: `(#NNN)`. + +New entries go under `## [Unreleased]`. Do not add entries directly under a versioned +section — version assignment happens in the release PR. diff --git a/.github/instructions/ci.instructions.md b/.github/instructions/ci.instructions.md index b4fb459..b82645c 100644 --- a/.github/instructions/ci.instructions.md +++ b/.github/instructions/ci.instructions.md @@ -2,29 +2,26 @@ applyTo: ".github/workflows/**,deny.toml,.cargo/**" --- -## CI and toolchain review checklist +## Workflow correctness -### Workflows +Feature flag sets in CI jobs must match the pre-commit commands documented in +`CLAUDE.md` exactly. Divergence causes "passes locally, fails CI" failures that are +hard to diagnose — the developer sees green locally and red in CI for the same code. -- Feature flag sets in CI jobs must match the local pre-commit commands in `CLAUDE.md` - exactly: `--all-features` for clippy and nextest, `--no-deps --all-features` for - rustdoc. Divergence causes "passes locally, fails CI" issues. -- `RUSTDOCFLAGS="-D warnings"` must be set in the rustdoc job. Without it, broken - intra-doc links and missing doc comments pass silently. -- Nightly toolchain is required only for `cargo +nightly fmt`. All other jobs must use - stable. Pinning nightly globally causes unnecessary breakage on nightly regressions. +The rustdoc job must set `RUSTDOCFLAGS="-D warnings"`. Without it, broken intra-doc +links and undocumented public items pass silently in CI while failing for downstream +users who build with warnings-as-errors. -### deny.toml +Nightly toolchain is required only for `cargo fmt`. All other jobs must use stable. +Pinning nightly globally causes unnecessary pipeline failures when nightly introduces +a regression unrelated to this project. -- New dependencies require a license check before merge. Add the license to the - `allow` list in the same PR that introduces the crate. Do not merge a PR that - introduces a new crate without verifying `cargo deny check licenses` passes. -- Known unlisted licenses that will fail CI: `CC0-1.0` (used by `notify`). -- `cargo deny check advisories` must be clean. A new RUSTSEC advisory in a transitive - dependency is a blocker even if the vulnerable code path is not exercised. +## Dependency policy -### MSRV +New dependencies must have their license added to the `deny.toml` allow list in the +same PR. Do not merge a PR that introduces a new crate without verifying that +`cargo deny check licenses` passes. License checks fail fast and block the entire +pipeline. -- `rust-version` in `Cargo.toml` is the enforced minimum. Any API stabilised after - that version requires either a version bump (document in CHANGELOG) or a conditional - compile guard. +`cargo deny check advisories` must be clean. Suppress an advisory only with an explicit +`ignore` entry and a comment explaining why the vulnerable code path is not reachable. diff --git a/.github/instructions/mcp-tools.instructions.md b/.github/instructions/mcp-tools.instructions.md index c122d6a..6e63abf 100644 --- a/.github/instructions/mcp-tools.instructions.md +++ b/.github/instructions/mcp-tools.instructions.md @@ -2,24 +2,28 @@ applyTo: "crates/mcpls-core/src/mcp/**" --- -## MCP layer review checklist - -### Tool parameter structs (`tools.rs`) - -- Every position parameter (`line`, `character`) must be documented as 1-based in both - the `///` doc comment and the `#[schemars(description)]` annotation. LSP is 0-based; - the conversion happens in `bridge/encoding.rs` — not here. -- `file_path` parameters must accept absolute paths only. Doc comments must state this - explicitly so AI clients don't pass relative paths. -- New tool structs require `#[derive(JsonSchema)]` — omitting it silently removes the - tool from the MCP schema exposed to clients. - -### Tool dispatch (`handlers.rs`, `server.rs`) - -- Every tool handler must call `ensure_open` (via `Translator`) before issuing any LSP - request. Skipping it produces stale results when the file has changed on disk since - the last access. -- Tool errors must propagate as MCP error responses, not panics. `unwrap()` and - `expect()` are not acceptable in handler code. -- New tools must be registered in both the tool list and the dispatch match arm. - A tool missing from either will either never appear to clients or panic at runtime. +## Tool parameter types + +Every position parameter must be documented as 1-based in both the doc comment and the +schema description. The MCP↔LSP coordinate conversion happens in the bridge layer, not +in the tool definition — the doc comment is the contract visible to AI clients. + +Path parameters must accept absolute paths. State this explicitly in the description so +clients do not pass relative paths that silently resolve to wrong locations. + +New tool structs require `#[derive(JsonSchema)]`. Omitting it silently removes the tool +from the schema exposed to clients — callers cannot discover or invoke it. + +## Tool dispatch + +Every tool handler must ensure the target document is open and in sync with disk before +issuing an LSP request. Skipping the open step returns stale results when the file was +modified externally since the last access. + +Errors must propagate as structured MCP error responses. `unwrap()` and `expect()` are +not acceptable in handler code — a panic kills the entire server process and drops all +active sessions. + +When adding a new tool, register it in both the tool list (capability advertisement) and +the dispatch match arm. A tool present in only one of the two will either never be +discoverable by clients or panic at runtime when called. diff --git a/.github/instructions/rust-bridge.instructions.md b/.github/instructions/rust-bridge.instructions.md index aa319b5..97517a5 100644 --- a/.github/instructions/rust-bridge.instructions.md +++ b/.github/instructions/rust-bridge.instructions.md @@ -2,54 +2,79 @@ applyTo: "crates/mcpls-core/src/bridge/**,crates/mcpls-core/src/lsp/**,crates/mcpls-core/src/lib.rs" --- -## Bridge and LSP layer review checklist - -### Document state (`bridge/state.rs`) - -- `ensure_open` stats the file on every call and compares `(mtime, size)` against the - cached `SyncSignature`. On mismatch it must send `didClose` then `didOpen` with a - bumped version before returning. Verify this resync path is exercised by a test. -- Stat must happen on the same `File` handle as the subsequent read, or be re-done after - the read, to avoid TOCTOU: an atomic `rename(2)` between stat and `read_to_string` on - a coarse-mtime filesystem (ext4: 1 s, FAT32: 2 s) leaves the cached signature stale - permanently. -- Version counter must reset to 1 on overflow, not saturate. `saturating_add(1)` at - `i32::MAX` sends the same version on every subsequent resync; rust-analyzer silently - discards non-monotone `didOpen` notifications. - -### Diagnostics pipeline (`bridge/translator.rs`, `bridge/notifications.rs`) - -- `DiagnosticsMode::Cached` returns an empty `Vec` for both "no errors" and "server - has not yet analysed this file". AI clients cannot distinguish these. Flag if callers - treat empty as "clean". -- `merge_diagnostics` dedup key is `(range, severity, code)` when `code` is present, - falling back to a path-qualifier-stripped message otherwise. Doc comments that say - `(range, message, code)` are wrong — flag them. -- `pull_diagnostics` has a 30 s timeout. In `Hybrid` mode it runs before the cache read. - The cache read is a synchronous hashmap lookup; it should not be blocked behind the - pull. Prefer `tokio::join!` for the two sources. - -### Notification pump (`lib.rs`) - -- `notification_pump` must hold only a `NotificationCache` lock, never the full - `Translator` lock. Holding `Mutex` across an LSP round-trip in a request - handler while the pump waits for the same lock causes head-of-line deadlock when the - notification channel fills. - -### File watcher (`lsp/file_watcher.rs`) - -- `GlobPattern::String` patterns without a leading `/` or `**/` must be anchored with - `**/` before passing to `globset`. Bare `*.rs` does not match `/repo/src/lib.rs`. -- `SyncSender::send` blocks when the channel is full. Use `try_send` with a warn log - when the documented intent is drop-on-full. -- `NEVER_FORWARD_COMPONENTS` filtering should happen before the channel send, not after, - to avoid burning channel capacity on `target/` churn during `cargo build`. -- `register()` overwrites duplicate registration IDs silently. LSP spec treats - re-registration as an error; at minimum log a warning. - -### LSP client (`lsp/client.rs`, `lsp/transport.rs`) - -- Server-to-client requests (`method` + `id`) must receive a JSON-RPC response. - Unhandled methods should return error code `-32601` (Method Not Found), not be dropped. -- `id`-only messages (no `method`, no `result`, no `error`) are protocol errors and must - not be treated as successful `null` responses. +## Document state + +Lazy document opening must re-verify on-disk state on every access, not only on first +open. Cache a filesystem signature (e.g. `mtime` + file size) alongside the document +state and re-read the file when the signature changes. On mismatch, send `didClose` +then `didOpen` with a bumped version before serving the request. + +Stat and read must be bound to the same file handle to avoid TOCTOU. Performing a stat, +then separately opening the file for reading, creates a window where an atomic replace +(`rename`) can change the content without changing the signature. + +Version numbers must remain strictly monotone per URI. On counter overflow, reset to 1 +— after `didClose`/`didOpen` the server treats the document as a fresh open regardless +of the version value. + +## Diagnostics + +When a feature can source results from multiple channels (e.g. pull request and push +cache), the channels should be fetched concurrently rather than sequentially. Blocking +a cheap synchronous lookup behind a slow network round-trip wastes time. + +An empty result must be distinguishable from an absent result. Returning an empty +collection for "server has not yet analysed this file" is indistinguishable from "no +errors found" to callers. Either surface a cache-miss indicator or fall back to a +live request on cache miss. + +Deduplication keys should prefer structured fields (e.g. error code) over free-form +message text. When a code is present, key on `(range, severity, code)`. Fall back to +a normalised message only when no code exists. Document the actual key in the doc +comment — mismatches between the comment and the implementation cause silent bugs. + +## Notification pump + +A background notification pump must hold only the narrowest lock needed to write to +the cache. It must not share a lock with request handlers that hold it across network +I/O. If pump and handler compete for the same mutex, the handler's in-flight network +response can be blocked from arriving — the system deadlocks until the request timeout +fires. + +## File watcher + +Glob patterns from LSP registrations must be matched against full absolute paths. +`globset` does not implicitly anchor bare patterns to any directory. A pattern like +`*.rs` matches only a filename with no path component. Prepend `**/` to bare patterns +before compiling them into a glob set. Example: + +```rust +// wrong: "*.rs" won't match "/repo/src/lib.rs" +// correct: "**/*.rs" matches any .rs file at any depth +let anchored = if !pattern.starts_with('/') && !pattern.starts_with("**/") { + format!("**/{pattern}") +} else { + pattern.to_owned() +}; +``` + +Event filtering (ignoring build artifacts, VCS directories, etc.) should happen before +events enter a bounded channel, not after. Filtering after send wastes channel capacity +and risks stalling the OS watcher thread if the channel fills. + +`SyncSender::send` blocks when the channel is full — it does not drop. Use `try_send` +when the intent is to discard events under backpressure. + +Duplicate `registerCapability` IDs must be handled explicitly. The LSP spec treats +re-registration as an error; silently overwriting the old registration loses the +previous glob set without notice. + +## LSP client + +Server-to-client requests (messages with both `method` and `id`) must receive a +JSON-RPC response. Unhandled methods must return error code `-32601` (Method Not Found) +— dropping them silently causes the server to wait indefinitely for an acknowledgement. + +Response classification must require either `result` or `error` to be present. A +message with only `id` and no `method`, `result`, or `error` is a protocol error and +must not be coerced into a successful `null` response. diff --git a/.github/instructions/rust-config.instructions.md b/.github/instructions/rust-config.instructions.md index c794dc3..78a1f24 100644 --- a/.github/instructions/rust-config.instructions.md +++ b/.github/instructions/rust-config.instructions.md @@ -2,21 +2,29 @@ applyTo: "crates/mcpls-core/src/config/**,deny.toml,Cargo.toml,crates/*/Cargo.toml" --- -## Config and dependency review checklist +## Configuration -### deny.toml +New `#[serde(default)]` fields must have a test that deserialises a config with the +field omitted and asserts the default value. Both JSON and TOML paths should be covered +— `#[serde(rename_all)]` applies uniformly, but TOML and JSON parsers can behave +differently for edge cases like empty strings and null values. -- When a new crate is introduced, verify its license appears in the `allow` list. - `cargo deny check licenses` runs in CI and will fail on first run if the license is - missing. Known gap: `CC0-1.0` (used by `notify`). -- Check `cargo deny check advisories` output for any new RUSTSEC advisories in added - dependencies. +Enum config options must include a round-trip test for every variant covering at least +one serialisation format. A rename or typo in `#[serde(rename_all = "...")]` silently +changes the on-disk format and breaks existing configs. -### Config (`config/mod.rs`) +## Dependencies -- New `#[serde(default)]` fields on public config structs must have a test covering - the omitted-key case (TOML round-trip, not just JSON). -- Enum config options must include serde round-trip tests for every variant, both JSON - and TOML, since `#[serde(rename_all)]` applies to both serializers independently. -- LSP server discovery heuristics (file-pattern matching) must be case-insensitive on - the extension comparison — APFS and NTFS are case-insensitive filesystems. +When a new crate is introduced, check that its license appears in the `deny.toml` allow +list before merging. `cargo deny check licenses` fails fast in CI and blocks the entire +pipeline. Add the license in the same PR that adds the crate. + +`cargo deny check advisories` must be clean. A RUSTSEC advisory in a transitive +dependency is a blocker even when the vulnerable code path is not exercised by this +project. + +## MSRV + +`rust-version` in `Cargo.toml` is the enforced minimum supported Rust version. Any +stabilised API used in new code must have been available since that version. If a newer +API is needed, bump `rust-version` and document the change in CHANGELOG. diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md index 2ea28bc..20603de 100644 --- a/.github/instructions/tests.instructions.md +++ b/.github/instructions/tests.instructions.md @@ -2,32 +2,32 @@ applyTo: "crates/mcpls-core/tests/**,crates/mcpls-core/src/**/tests.rs,crates/mcpls-core/src/**/*_tests.rs" --- -## Test review checklist - -### Integration tests (`tests/integration/`) - -- Tests requiring a live LSP binary (`rust-analyzer`, `tsgo`, etc.) must be marked - `#[ignore = "Requires in PATH"]`. Unmarked tests that shell out to LSP - binaries will break CI on clean runners. -- `rust_analyzer_tests.rs` probes must verify the post-PR #103 diagnostic pipeline: - `get_diagnostics` should return errors for files with known type errors, and - `get_cached_diagnostics` should return the same set (non-empty) after a - `publishDiagnostics` notification has been received. - -### E2e tests (`tests/e2e/`) - -- Every e2e test spawns the mcpls binary. If the binary is not pre-built, the test - must be `#[ignore]`. Document the required build step in the ignore message. -- Position values in e2e assertions must use 1-based line/column (MCP convention). - Using 0-based values will produce off-by-one failures that are hard to diagnose. - -### Unit tests - -- `ensure_open` resync path (signature mismatch → `didClose` + `didOpen`) must be - covered by an async test using a temp file and a mock `LspClient`. The stat-only - test (`test_stat_signature_changes_when_file_grows`) does not cover this path. -- `FileWatcher::register` and `compute_changes` require unit tests. New glob - registrations must be verified to actually match the intended file paths using - absolute path strings — bare patterns like `*.rs` will not match. -- `DiagnosticsMode` serde tests must cover all three variants in both JSON and TOML - deserialization, including the omitted-key default. +## Integration and e2e tests + +Tests that require an external binary (language server or the project binary itself) +must be marked `#[ignore = "Requires in PATH"]`. Unmarked tests that shell out +to external processes break CI on clean runners where those binaries are absent. + +Position values in assertions must use 1-based line and column numbers (MCP convention). +Using 0-based values produces off-by-one failures that are difficult to diagnose because +both the test and the production code look correct in isolation. + +## Unit tests + +New async code paths must have async unit tests using real temporary files and a mock +client where network I/O is involved. A test that only verifies the underlying primitive +(e.g. that a filesystem stat changes) does not cover the coordination logic that calls it. + +Glob matching tests must use absolute path strings as inputs. A test that passes a bare +filename to a glob set will pass even when the production code fails on absolute paths, +because `globset` anchoring behaviour differs between the two cases. + +New serialisable config types must have round-trip tests covering every variant in at +least one format. For enums with `#[serde(rename_all)]`, also verify that the +serialised string matches the documented on-disk value. + +## Coverage gaps to watch + +Any code path that involves both a filesystem operation and a subsequent network +notification (open, close, resync) should have a test that exercises the full sequence +with a mock network client, not just the filesystem part in isolation. From 7189f47edcf339cc18514c010d85417f3c71b9cb Mon Sep 17 00:00:00 2001 From: "Andrei G." Date: Wed, 29 Apr 2026 13:44:44 +0200 Subject: [PATCH 3/3] chore: extend Copilot instructions with type safety, idiomatic Rust, and MSRV guidance Add sections on: - Newtype patterns and typestate for domain values - Idiomatic error handling (let-else, ?, avoid unwrap in production) - Lock scope rules and async concurrency patterns (join!, try_send) - API currency: suggest MSRV bumps when newer std APIs cover existing deps - Examples of stable APIs worth adopting (let-else 1.65, OnceLock 1.70, etc.) --- .github/copilot-instructions.md | 111 +++++++++++------ .github/instructions/ci.instructions.md | 36 +++--- .../instructions/mcp-tools.instructions.md | 43 ++++--- .../instructions/rust-bridge.instructions.md | 114 +++++++++--------- .../instructions/rust-config.instructions.md | 42 ++++--- .github/instructions/tests.instructions.md | 49 ++++---- 6 files changed, 230 insertions(+), 165 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 65d52a2..c5982f3 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -9,51 +9,90 @@ AI Client ←→ [MCP/stdio] ←→ mcpls ←→ [LSP/stdio] ←→ language ser Key crates: `mcpls-core/src/bridge/`, `lsp/`, `mcp/`, `config/`. -## Correctness +## Type safety -**Position encoding** — MCP is 1-based, LSP is 0-based. All line/column conversions must -go through the dedicated encoding module. Direct arithmetic on position values outside -that module is a bug. +Prefer newtypes over primitive aliases for domain values. A raw `u32` for a line number +and a raw `u32` for a column are indistinguishable to the compiler; a `Line(u32)` and +`Column(u32)` make transpositions a compile error. -**LSP JSON-RPC message classification** — four shapes exist: -- `method` + `id` → server-to-client request (must be replied to) -- `method` only → notification (fire-and-forget) -- `id` + `result` or `error` → response -- anything else → protocol error +Encode protocol state in the type system where possible. A document that has been opened +and one that has not should ideally be different types, not the same type with a boolean +field. Typestate prevents calling operations that are only valid on open documents. -A message with `id` but no `method`, `result`, or `error` is invalid and must not be -silently treated as a successful `null` response. +Avoid stringly-typed dispatch. Matching on `&str` method names to branch protocol +behaviour should be replaced with a typed enum as soon as the set of methods is known +and bounded. -**Document version monotonicity** — `textDocument/didOpen` version numbers must strictly -increase per URI across the document's lifetime. On counter overflow, reset to 1 rather -than saturating — after `didClose`/`didOpen` the server treats the document as fresh -regardless of the version number. +`Option` and `Result` must not be unwrapped with `.unwrap()` or `.expect()` in +production code paths. Use `?`, `if let`, `let-else`, or explicit error mapping. -**Filesystem signature freshness** — when caching file state with `(mtime, size)`, stat -and read must be bound to the same moment. A stat before the read creates a TOCTOU -window: on filesystems with low mtime resolution a same-size rewrite within a single -clock tick leaves the signature unchanged while content has changed. +## Idiomatic Rust -## Concurrency +Prefer iterator adapters over manual loops. `map`, `filter`, `flat_map`, `collect`, and +`fold` express intent more clearly and are easier to compose than `for` with a mutable +accumulator. -**Lock granularity** — a mutex that guards a large struct must not be held across async -I/O. If a background task and a request handler share the same lock, and the request -handler awaits a network round-trip while holding it, the background task stalls. When -the background channel fills, the network response can no longer be delivered — -head-of-line deadlock. Prefer fine-grained locks scoped to the data they protect. +Avoid unnecessary heap allocation. Prefer `&str` over `String` in function signatures +where ownership is not required. Prefer `Cow` when a function sometimes borrows and +sometimes owns. -**Channel drop-on-full semantics** — `std::sync::mpsc::SyncSender::send` blocks when -the channel is full; it does not drop. Use `try_send` when the intent is to discard -events under backpressure and never block the sender thread. +Use `let-else` (stabilised in Rust 1.65) to reduce nesting when early return on `None` +or `Err` is needed: +```rust +// prefer +let Some(value) = option else { return Err(...) }; +// over +let value = match option { Some(v) => v, None => return Err(...) }; +``` + +Derive `Clone`, `Debug`, `PartialEq` on data types unless there is a specific reason not +to. Their absence makes testing harder and the omission is rarely intentional. + +## Architecture + +A component that bridges two protocols (MCP and LSP) must not let either protocol's +failure mode affect the other. LSP request timeouts must not stall MCP response +delivery, and LSP push notifications must not block MCP request handling. + +Shared mutable state must be scoped as narrowly as possible. A `Mutex` that guards a +large composite struct is a concurrency bottleneck; prefer separate locks for +independent sub-components (e.g. a cache and a client are independent). + +A `Mutex` guard must never be held across an `.await` point that involves I/O. Doing so +holds the lock for the full duration of the network round-trip, blocking every other +task that needs the lock. Extract the guarded value before the `.await` or restructure +so the lock is released first. + +## Async + +Prefer `tokio::join!` or `tokio::try_join!` over sequential `.await` chains when +operations are independent. Sequential awaits serialise work that could run concurrently. + +Blocking operations (`std::fs`, `std::thread::sleep`, CPU-heavy loops) must not run on +the async executor. Use `tokio::fs`, `tokio::time::sleep`, or `tokio::task::spawn_blocking` +for work that would block a task for more than a few microseconds. + +Use `tokio::sync::mpsc` for async-to-async channels. Use `std::sync::mpsc::sync_channel` +only for bridging a synchronous context to async, and always with `try_send` when the +intent is to drop events under backpressure — `send` blocks on a full channel. + +Prefer `tokio::select!` with a cancellation token over bare `loop { recv().await }` +for background tasks that must shut down cleanly. -## Dependencies +## API currency and MSRV -When a new crate is introduced, verify its license is in the `cargo deny` allow list. -A missing license will fail `cargo deny check licenses` in CI on the first run. +When reviewing code, check whether newer stable Rust APIs would simplify it. If a +simpler or safer API was stabilised after the current `rust-version` in `Cargo.toml`, +suggest bumping MSRV and using the new API. Document the bump in CHANGELOG under +`### Changed`. -## Style +Examples of APIs worth adopting when MSRV permits: +- `let-else` expressions (1.65) — replace verbose `match`/`unwrap_or_else` for early return +- `std::io::read_to_string` on a `File` handle (1.0, but pairing with `.metadata()` on the same + handle closes TOCTOU windows — prefer over separate `stat` + `open`) +- `OnceLock` (1.70) — prefer over `lazy_static` or `once_cell` for static initialisation +- `is_some_and` / `is_none_or` (1.70) — replace `map(|v| ...).unwrap_or(false)` patterns +- `Iterator::array_chunks` (nightly → stable tracking) — watch stabilisation status -- All `pub` items must have `///` doc comments explaining *what* and *why*. -- No `unsafe` code — `deny(unsafe_code)` is enforced workspace-wide. -- Public enums that may gain variants must be marked `#[non_exhaustive]`, and breaking - additions must be documented under `### Changed` in CHANGELOG. +When a dependency provides a feature already in `std`, prefer `std`. Fewer dependencies +reduce supply-chain risk and compile time. diff --git a/.github/instructions/ci.instructions.md b/.github/instructions/ci.instructions.md index b82645c..c695a38 100644 --- a/.github/instructions/ci.instructions.md +++ b/.github/instructions/ci.instructions.md @@ -2,26 +2,32 @@ applyTo: ".github/workflows/**,deny.toml,.cargo/**" --- -## Workflow correctness +## Feature flag parity -Feature flag sets in CI jobs must match the pre-commit commands documented in -`CLAUDE.md` exactly. Divergence causes "passes locally, fails CI" failures that are -hard to diagnose — the developer sees green locally and red in CI for the same code. +Feature flag sets in CI jobs must match the pre-commit commands in `CLAUDE.md` exactly. +Divergence causes "passes locally, fails CI" failures where neither side is obviously +wrong. The rustdoc job must set `RUSTDOCFLAGS="-D warnings"`. Without it, broken intra-doc -links and undocumented public items pass silently in CI while failing for downstream -users who build with warnings-as-errors. +links and missing `///` on public items pass silently in CI. -Nightly toolchain is required only for `cargo fmt`. All other jobs must use stable. -Pinning nightly globally causes unnecessary pipeline failures when nightly introduces -a regression unrelated to this project. +Nightly toolchain is required only for `cargo fmt`. All other jobs must use stable to +avoid failures caused by nightly regressions unrelated to this project. -## Dependency policy +## Dependency and license policy -New dependencies must have their license added to the `deny.toml` allow list in the -same PR. Do not merge a PR that introduces a new crate without verifying that -`cargo deny check licenses` passes. License checks fail fast and block the entire -pipeline. +New dependencies require a license check in the same PR. `cargo deny check licenses` +fails fast and blocks the pipeline — do not merge before it passes. `cargo deny check advisories` must be clean. Suppress an advisory only with an explicit -`ignore` entry and a comment explaining why the vulnerable code path is not reachable. +`ignore` entry and a comment explaining why the vulnerable code path is unreachable. + +When a newer stable `std` API covers functionality provided by a dependency, suggest +removing the dependency and bumping `rust-version` instead. Fewer dependencies reduce +compile time and supply-chain risk. Document the MSRV bump in CHANGELOG. + +## MSRV tracking + +`rust-version` in `Cargo.toml` is the enforced minimum. The CI matrix must include a +job that builds and tests on exactly that version to catch accidental use of newer APIs. +Without such a job, MSRV guarantees are not verifiable. diff --git a/.github/instructions/mcp-tools.instructions.md b/.github/instructions/mcp-tools.instructions.md index 6e63abf..adbab6f 100644 --- a/.github/instructions/mcp-tools.instructions.md +++ b/.github/instructions/mcp-tools.instructions.md @@ -2,28 +2,37 @@ applyTo: "crates/mcpls-core/src/mcp/**" --- -## Tool parameter types +## Type safety -Every position parameter must be documented as 1-based in both the doc comment and the -schema description. The MCP↔LSP coordinate conversion happens in the bridge layer, not -in the tool definition — the doc comment is the contract visible to AI clients. +Tool parameter structs must use domain types, not raw primitives, for values with +protocol-specific conventions. Position fields (line, column) must be documented as +1-based in both the `///` doc comment and the `#[schemars(description)]` — the MCP↔LSP +coordinate conversion happens in the bridge layer, not here. -Path parameters must accept absolute paths. State this explicitly in the description so -clients do not pass relative paths that silently resolve to wrong locations. +Path parameters must accept absolute paths. State this in the schema description; AI +clients use it to form requests correctly. New tool structs require `#[derive(JsonSchema)]`. Omitting it silently removes the tool -from the schema exposed to clients — callers cannot discover or invoke it. +from the schema exposed to clients — they cannot discover or invoke it. -## Tool dispatch +## Error handling -Every tool handler must ensure the target document is open and in sync with disk before -issuing an LSP request. Skipping the open step returns stale results when the file was -modified externally since the last access. +Tool handlers must return structured MCP error responses on failure, never panic. +`unwrap()` and `expect()` in handler code kill the entire server process and drop all +active client sessions. Use `?` with proper error mapping instead. -Errors must propagate as structured MCP error responses. `unwrap()` and `expect()` are -not acceptable in handler code — a panic kills the entire server process and drops all -active sessions. +All error variants must carry enough context for the caller to understand what went +wrong. A bare `"internal error"` string is not actionable. Include the file path, +method name, or other relevant detail. -When adding a new tool, register it in both the tool list (capability advertisement) and -the dispatch match arm. A tool present in only one of the two will either never be -discoverable by clients or panic at runtime when called. +## Idiomatic dispatch + +When adding a new tool, register it in both the capability advertisement (tool list) +and the dispatch match arm. A tool missing from the list is undiscoverable; a tool +missing from dispatch panics at runtime. A compile-time check (e.g. a const assertion +or a test that calls every listed tool name through dispatch) is preferable to relying +on manual review. + +Prefer exhaustive match arms over wildcard catch-alls in dispatch. A wildcard silently +swallows unhandled tool names; exhaustive matching makes unregistered names a compile +error. diff --git a/.github/instructions/rust-bridge.instructions.md b/.github/instructions/rust-bridge.instructions.md index 97517a5..5bf42a0 100644 --- a/.github/instructions/rust-bridge.instructions.md +++ b/.github/instructions/rust-bridge.instructions.md @@ -2,79 +2,79 @@ applyTo: "crates/mcpls-core/src/bridge/**,crates/mcpls-core/src/lsp/**,crates/mcpls-core/src/lib.rs" --- -## Document state +## Type safety + +Position types (line, column) must be distinct newtypes, not bare `u32`. The MCP +convention (1-based) and the LSP convention (0-based) must be expressed in the type +system so that passing an MCP position directly to an LSP call is a compile error. -Lazy document opening must re-verify on-disk state on every access, not only on first -open. Cache a filesystem signature (e.g. `mtime` + file size) alongside the document -state and re-read the file when the signature changes. On mismatch, send `didClose` -then `didOpen` with a bumped version before serving the request. +Protocol message variants must be an enum, not a string-matched dispatch. Matching on +`"textDocument/hover"` strings is fragile; a closed enum of known methods makes +unhandled cases visible at compile time and exhaustiveness-checked. -Stat and read must be bound to the same file handle to avoid TOCTOU. Performing a stat, -then separately opening the file for reading, creates a window where an atomic replace -(`rename`) can change the content without changing the signature. +`InboundMessage` and similar enums that may gain variants in future protocol versions +must be `#[non_exhaustive]` so downstream crates are not broken by additions. -Version numbers must remain strictly monotone per URI. On counter overflow, reset to 1 -— after `didClose`/`didOpen` the server treats the document as a fresh open regardless -of the version value. +## Document state -## Diagnostics +Caching document content alongside a filesystem signature (`mtime` + size) prevents +stale reads after external edits. On every access, stat the file and compare against +the cached signature; re-read and re-notify the LSP server on mismatch. -When a feature can source results from multiple channels (e.g. pull request and push -cache), the channels should be fetched concurrently rather than sequentially. Blocking -a cheap synchronous lookup behind a slow network round-trip wastes time. +Stat and read must use the same file handle to close the TOCTOU window. Open the file +once with `tokio::fs::File::open`, call `.metadata().await` on the handle, then read +through the same handle. A separate `tokio::fs::metadata` call before `read_to_string` +leaves a window where an atomic `rename` can change the file between the two calls. -An empty result must be distinguishable from an absent result. Returning an empty -collection for "server has not yet analysed this file" is indistinguishable from "no -errors found" to callers. Either surface a cache-miss indicator or fall back to a -live request on cache miss. +Document version numbers must be strictly monotone per URI. On overflow, reset to 1 +rather than saturating — after `didClose`/`didOpen` the server treats the document as +fresh and the version value is irrelevant to continuity. -Deduplication keys should prefer structured fields (e.g. error code) over free-form -message text. When a code is present, key on `(range, severity, code)`. Fall back to -a normalised message only when no code exists. Document the actual key in the doc -comment — mismatches between the comment and the implementation cause silent bugs. +## Diagnostics pipeline -## Notification pump +When a feature draws from multiple independent sources (pull request + push cache), +fetch them concurrently with `tokio::try_join!` or `tokio::join!`. Awaiting them +sequentially serialises work that has no data dependency. -A background notification pump must hold only the narrowest lock needed to write to -the cache. It must not share a lock with request handlers that hold it across network -I/O. If pump and handler compete for the same mutex, the handler's in-flight network -response can be blocked from arriving — the system deadlocks until the request timeout -fires. +A cache miss must be distinguishable from an empty result. Returning `Vec::new()` for +both "no errors" and "not yet analysed" is ambiguous to callers. Use an `Option` +or a dedicated status type to express the difference. -## File watcher +Deduplication keys should prefer structured fields (error code) over free-form message +text. When a code is present, key on `(range, severity, code)`. The doc comment must +describe the actual key — a mismatch between the comment and the implementation causes +silent behavioural bugs. -Glob patterns from LSP registrations must be matched against full absolute paths. -`globset` does not implicitly anchor bare patterns to any directory. A pattern like -`*.rs` matches only a filename with no path component. Prepend `**/` to bare patterns -before compiling them into a glob set. Example: +## Concurrency and lock scope -```rust -// wrong: "*.rs" won't match "/repo/src/lib.rs" -// correct: "**/*.rs" matches any .rs file at any depth -let anchored = if !pattern.starts_with('/') && !pattern.starts_with("**/") { - format!("**/{pattern}") -} else { - pattern.to_owned() -}; -``` +Background notification pumps must hold only the narrowest lock needed (e.g. a +`Mutex`, not a `Mutex`). A pump that shares a lock with request +handlers will stall when a handler holds the lock across a long-running await, causing +the notification channel to fill and the in-flight response to be undeliverable. + +Never hold a `MutexGuard` across an `.await`. Extract needed values before the await, +or restructure so the guard is dropped before any I/O begins. + +## File watcher -Event filtering (ignoring build artifacts, VCS directories, etc.) should happen before -events enter a bounded channel, not after. Filtering after send wastes channel capacity -and risks stalling the OS watcher thread if the channel fills. +Glob patterns from external registrations must be matched against full absolute paths. +`globset` does not implicitly anchor bare patterns to a directory prefix — a pattern +like `*.rs` only matches a bare filename. Prepend `**/` to patterns that lack a leading +`/` or `**/` before compiling them into a `GlobSet`. -`SyncSender::send` blocks when the channel is full — it does not drop. Use `try_send` -when the intent is to discard events under backpressure. +Noise filtering (build artifacts, VCS directories) must run before events enter any +bounded channel, not after. Filtering post-send wastes channel capacity and can stall +the OS watcher thread when the channel fills during high-churn builds. -Duplicate `registerCapability` IDs must be handled explicitly. The LSP spec treats -re-registration as an error; silently overwriting the old registration loses the -previous glob set without notice. +Use `try_send` for the handoff from the OS watcher callback to the async processing +loop. The watcher callback runs on a system thread; blocking it with `send` on a full +channel stalls the OS-level event delivery. ## LSP client -Server-to-client requests (messages with both `method` and `id`) must receive a -JSON-RPC response. Unhandled methods must return error code `-32601` (Method Not Found) -— dropping them silently causes the server to wait indefinitely for an acknowledgement. +Server-to-client requests (messages with both `method` and `id`) require a JSON-RPC +response. Unhandled methods must return error code `-32601` (Method Not Found). Dropping +them silently causes the LSP server to wait indefinitely for an acknowledgement. -Response classification must require either `result` or `error` to be present. A -message with only `id` and no `method`, `result`, or `error` is a protocol error and -must not be coerced into a successful `null` response. +`id`-only messages (no `method`, `result`, or `error`) are protocol errors and must not +be coerced into a successful `null` response. diff --git a/.github/instructions/rust-config.instructions.md b/.github/instructions/rust-config.instructions.md index 78a1f24..d14292b 100644 --- a/.github/instructions/rust-config.instructions.md +++ b/.github/instructions/rust-config.instructions.md @@ -2,29 +2,33 @@ applyTo: "crates/mcpls-core/src/config/**,deny.toml,Cargo.toml,crates/*/Cargo.toml" --- -## Configuration +## Type safety and idiomatic config -New `#[serde(default)]` fields must have a test that deserialises a config with the -field omitted and asserts the default value. Both JSON and TOML paths should be covered -— `#[serde(rename_all)]` applies uniformly, but TOML and JSON parsers can behave -differently for edge cases like empty strings and null values. +Config option enums must derive `Default` with `#[default]` on the intended default +variant rather than implementing `Default` manually. This keeps the default co-located +with the type definition and is less error-prone. + +Boolean config fields that represent a choice between more than two states should be +`enum`, not `bool`. A `bool` cannot be extended without a breaking change; an enum can +gain variants under `#[non_exhaustive]`. -Enum config options must include a round-trip test for every variant covering at least -one serialisation format. A rename or typo in `#[serde(rename_all = "...")]` silently -changes the on-disk format and breaks existing configs. +New `#[serde(default)]` fields must have a test that deserialises a config with the +field omitted and asserts the default value. Cover both JSON and TOML — parsers can +behave differently on edge cases like empty strings and absent vs. null keys. -## Dependencies +Enum config options must have a round-trip test for every variant. A typo in +`#[serde(rename_all = "...")]` silently changes the on-disk format and breaks existing +user configs without a compile error. -When a new crate is introduced, check that its license appears in the `deny.toml` allow -list before merging. `cargo deny check licenses` fails fast in CI and blocks the entire -pipeline. Add the license in the same PR that adds the crate. +## MSRV and dependencies -`cargo deny check advisories` must be clean. A RUSTSEC advisory in a transitive -dependency is a blocker even when the vulnerable code path is not exercised by this -project. +When introducing a new dependency, check whether the functionality is already available +in `std` for the current MSRV. Prefer `std` over external crates to reduce +compile time and supply-chain surface. -## MSRV +New crates require a license check in the same PR. Add the license to `deny.toml` +before merging — `cargo deny check licenses` fails fast and blocks the pipeline. -`rust-version` in `Cargo.toml` is the enforced minimum supported Rust version. Any -stabilised API used in new code must have been available since that version. If a newer -API is needed, bump `rust-version` and document the change in CHANGELOG. +If a new stable Rust API (e.g. `OnceLock` at 1.70 replacing `once_cell::sync::OnceCell`) +makes an existing dependency redundant, suggest removing the dependency and bumping +`rust-version` instead. Document the bump in CHANGELOG. diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md index 20603de..18703ac 100644 --- a/.github/instructions/tests.instructions.md +++ b/.github/instructions/tests.instructions.md @@ -2,32 +2,39 @@ applyTo: "crates/mcpls-core/tests/**,crates/mcpls-core/src/**/tests.rs,crates/mcpls-core/src/**/*_tests.rs" --- -## Integration and e2e tests +## Test design -Tests that require an external binary (language server or the project binary itself) -must be marked `#[ignore = "Requires in PATH"]`. Unmarked tests that shell out -to external processes break CI on clean runners where those binaries are absent. +Tests for async code must use `#[tokio::test]` and exercise the actual async paths, not +synchronous wrappers that bypass scheduler interactions. Concurrency bugs (lock +contention, channel saturation) are only reachable through genuine async execution. -Position values in assertions must use 1-based line and column numbers (MCP convention). -Using 0-based values produces off-by-one failures that are difficult to diagnose because -both the test and the production code look correct in isolation. +Tests that require an external binary must be marked `#[ignore = "Requires +in PATH"]`. CI runners do not have language server binaries; an unmarked test will fail +on every clean run. -## Unit tests +## Correctness invariants to cover -New async code paths must have async unit tests using real temporary files and a mock -client where network I/O is involved. A test that only verifies the underlying primitive -(e.g. that a filesystem stat changes) does not cover the coordination logic that calls it. +Any code path that pairs a filesystem operation with a subsequent network notification +(open, close, re-open after external edit) must have a test that exercises the full +sequence using a mock network client and a real temporary file — not just the filesystem +primitive in isolation. -Glob matching tests must use absolute path strings as inputs. A test that passes a bare -filename to a glob set will pass even when the production code fails on absolute paths, -because `globset` anchoring behaviour differs between the two cases. +Glob matching tests must use absolute path strings as input. Passing a bare filename +to a glob set will pass even when the production code fails on absolute paths, because +`globset` anchoring behaviour differs between the two cases. -New serialisable config types must have round-trip tests covering every variant in at -least one format. For enums with `#[serde(rename_all)]`, also verify that the -serialised string matches the documented on-disk value. +Round-trip tests for serialisable types must cover every variant, and must assert the +serialised string value, not just that deserialisation succeeds. A `rename_all` typo +produces a wrong on-disk format that round-trips correctly in isolation. -## Coverage gaps to watch +## Idiomatic test code -Any code path that involves both a filesystem operation and a subsequent network -notification (open, close, resync) should have a test that exercises the full sequence -with a mock network client, not just the filesystem part in isolation. +Prefer `assert_eq!` over `assert!(a == b)` — `assert_eq!` prints both values on +failure, making diagnosis faster. + +Use `tempfile::tempdir()` for tests that need real filesystem paths. Hard-coded `/tmp` +paths cause test interference when multiple test threads run in parallel. + +Prefer `#[tokio::test]` over `tokio::runtime::Runtime::block_on` in test bodies — +`block_on` does not integrate with tokio's test utilities (`time::pause`, +`time::advance`) needed for testing debounce and timeout logic.