diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 0000000..c5982f3 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,98 @@ +# Copilot Instructions for mcpls + +mcpls is a bridge between MCP (Model Context Protocol) and LSP (Language Server Protocol). +AI clients speak MCP to mcpls; mcpls spawns and communicates with language servers over LSP. + +``` +AI Client ←→ [MCP/stdio] ←→ mcpls ←→ [LSP/stdio] ←→ language server +``` + +Key crates: `mcpls-core/src/bridge/`, `lsp/`, `mcp/`, `config/`. + +## Type safety + +Prefer newtypes over primitive aliases for domain values. A raw `u32` for a line number +and a raw `u32` for a column are indistinguishable to the compiler; a `Line(u32)` and +`Column(u32)` make transpositions a compile error. + +Encode protocol state in the type system where possible. A document that has been opened +and one that has not should ideally be different types, not the same type with a boolean +field. Typestate prevents calling operations that are only valid on open documents. + +Avoid stringly-typed dispatch. Matching on `&str` method names to branch protocol +behaviour should be replaced with a typed enum as soon as the set of methods is known +and bounded. + +`Option` and `Result` must not be unwrapped with `.unwrap()` or `.expect()` in +production code paths. Use `?`, `if let`, `let-else`, or explicit error mapping. + +## Idiomatic Rust + +Prefer iterator adapters over manual loops. `map`, `filter`, `flat_map`, `collect`, and +`fold` express intent more clearly and are easier to compose than `for` with a mutable +accumulator. + +Avoid unnecessary heap allocation. Prefer `&str` over `String` in function signatures +where ownership is not required. Prefer `Cow` when a function sometimes borrows and +sometimes owns. + +Use `let-else` (stabilised in Rust 1.65) to reduce nesting when early return on `None` +or `Err` is needed: +```rust +// prefer +let Some(value) = option else { return Err(...) }; +// over +let value = match option { Some(v) => v, None => return Err(...) }; +``` + +Derive `Clone`, `Debug`, `PartialEq` on data types unless there is a specific reason not +to. Their absence makes testing harder and the omission is rarely intentional. + +## Architecture + +A component that bridges two protocols (MCP and LSP) must not let either protocol's +failure mode affect the other. LSP request timeouts must not stall MCP response +delivery, and LSP push notifications must not block MCP request handling. + +Shared mutable state must be scoped as narrowly as possible. A `Mutex` that guards a +large composite struct is a concurrency bottleneck; prefer separate locks for +independent sub-components (e.g. a cache and a client are independent). + +A `Mutex` guard must never be held across an `.await` point that involves I/O. Doing so +holds the lock for the full duration of the network round-trip, blocking every other +task that needs the lock. Extract the guarded value before the `.await` or restructure +so the lock is released first. + +## Async + +Prefer `tokio::join!` or `tokio::try_join!` over sequential `.await` chains when +operations are independent. Sequential awaits serialise work that could run concurrently. + +Blocking operations (`std::fs`, `std::thread::sleep`, CPU-heavy loops) must not run on +the async executor. Use `tokio::fs`, `tokio::time::sleep`, or `tokio::task::spawn_blocking` +for work that would block a task for more than a few microseconds. + +Use `tokio::sync::mpsc` for async-to-async channels. Use `std::sync::mpsc::sync_channel` +only for bridging a synchronous context to async, and always with `try_send` when the +intent is to drop events under backpressure — `send` blocks on a full channel. + +Prefer `tokio::select!` with a cancellation token over bare `loop { recv().await }` +for background tasks that must shut down cleanly. + +## API currency and MSRV + +When reviewing code, check whether newer stable Rust APIs would simplify it. If a +simpler or safer API was stabilised after the current `rust-version` in `Cargo.toml`, +suggest bumping MSRV and using the new API. Document the bump in CHANGELOG under +`### Changed`. + +Examples of APIs worth adopting when MSRV permits: +- `let-else` expressions (1.65) — replace verbose `match`/`unwrap_or_else` for early return +- `std::io::read_to_string` on a `File` handle (1.0, but pairing with `.metadata()` on the same + handle closes TOCTOU windows — prefer over separate `stat` + `open`) +- `OnceLock` (1.70) — prefer over `lazy_static` or `once_cell` for static initialisation +- `is_some_and` / `is_none_or` (1.70) — replace `map(|v| ...).unwrap_or(false)` patterns +- `Iterator::array_chunks` (nightly → stable tracking) — watch stabilisation status + +When a dependency provides a feature already in `std`, prefer `std`. Fewer dependencies +reduce supply-chain risk and compile time. diff --git a/.github/instructions/changelog.instructions.md b/.github/instructions/changelog.instructions.md new file mode 100644 index 0000000..42d7133 --- /dev/null +++ b/.github/instructions/changelog.instructions.md @@ -0,0 +1,18 @@ +--- +applyTo: "CHANGELOG.md" +--- + +## Changelog format + +Breaking changes to public APIs must appear under `### Changed` with an explicit +"Breaking change:" prefix. A `### Fixed` entry alone is not sufficient — downstream +users scanning the changelog for breakage will miss it. + +Additions of `#[non_exhaustive]` to existing public enums or structs are breaking +changes for crates that match exhaustively. Document the required migration (add a +wildcard arm, update struct initialisation) in the entry. + +Every entry must reference the PR or issue number: `(#NNN)`. + +New entries go under `## [Unreleased]`. Do not add entries directly under a versioned +section — version assignment happens in the release PR. diff --git a/.github/instructions/ci.instructions.md b/.github/instructions/ci.instructions.md new file mode 100644 index 0000000..c695a38 --- /dev/null +++ b/.github/instructions/ci.instructions.md @@ -0,0 +1,33 @@ +--- +applyTo: ".github/workflows/**,deny.toml,.cargo/**" +--- + +## Feature flag parity + +Feature flag sets in CI jobs must match the pre-commit commands in `CLAUDE.md` exactly. +Divergence causes "passes locally, fails CI" failures where neither side is obviously +wrong. + +The rustdoc job must set `RUSTDOCFLAGS="-D warnings"`. Without it, broken intra-doc +links and missing `///` on public items pass silently in CI. + +Nightly toolchain is required only for `cargo fmt`. All other jobs must use stable to +avoid failures caused by nightly regressions unrelated to this project. + +## Dependency and license policy + +New dependencies require a license check in the same PR. `cargo deny check licenses` +fails fast and blocks the pipeline — do not merge before it passes. + +`cargo deny check advisories` must be clean. Suppress an advisory only with an explicit +`ignore` entry and a comment explaining why the vulnerable code path is unreachable. + +When a newer stable `std` API covers functionality provided by a dependency, suggest +removing the dependency and bumping `rust-version` instead. Fewer dependencies reduce +compile time and supply-chain risk. Document the MSRV bump in CHANGELOG. + +## MSRV tracking + +`rust-version` in `Cargo.toml` is the enforced minimum. The CI matrix must include a +job that builds and tests on exactly that version to catch accidental use of newer APIs. +Without such a job, MSRV guarantees are not verifiable. diff --git a/.github/instructions/mcp-tools.instructions.md b/.github/instructions/mcp-tools.instructions.md new file mode 100644 index 0000000..adbab6f --- /dev/null +++ b/.github/instructions/mcp-tools.instructions.md @@ -0,0 +1,38 @@ +--- +applyTo: "crates/mcpls-core/src/mcp/**" +--- + +## Type safety + +Tool parameter structs must use domain types, not raw primitives, for values with +protocol-specific conventions. Position fields (line, column) must be documented as +1-based in both the `///` doc comment and the `#[schemars(description)]` — the MCP↔LSP +coordinate conversion happens in the bridge layer, not here. + +Path parameters must accept absolute paths. State this in the schema description; AI +clients use it to form requests correctly. + +New tool structs require `#[derive(JsonSchema)]`. Omitting it silently removes the tool +from the schema exposed to clients — they cannot discover or invoke it. + +## Error handling + +Tool handlers must return structured MCP error responses on failure, never panic. +`unwrap()` and `expect()` in handler code kill the entire server process and drop all +active client sessions. Use `?` with proper error mapping instead. + +All error variants must carry enough context for the caller to understand what went +wrong. A bare `"internal error"` string is not actionable. Include the file path, +method name, or other relevant detail. + +## Idiomatic dispatch + +When adding a new tool, register it in both the capability advertisement (tool list) +and the dispatch match arm. A tool missing from the list is undiscoverable; a tool +missing from dispatch panics at runtime. A compile-time check (e.g. a const assertion +or a test that calls every listed tool name through dispatch) is preferable to relying +on manual review. + +Prefer exhaustive match arms over wildcard catch-alls in dispatch. A wildcard silently +swallows unhandled tool names; exhaustive matching makes unregistered names a compile +error. diff --git a/.github/instructions/rust-bridge.instructions.md b/.github/instructions/rust-bridge.instructions.md new file mode 100644 index 0000000..5bf42a0 --- /dev/null +++ b/.github/instructions/rust-bridge.instructions.md @@ -0,0 +1,80 @@ +--- +applyTo: "crates/mcpls-core/src/bridge/**,crates/mcpls-core/src/lsp/**,crates/mcpls-core/src/lib.rs" +--- + +## Type safety + +Position types (line, column) must be distinct newtypes, not bare `u32`. The MCP +convention (1-based) and the LSP convention (0-based) must be expressed in the type +system so that passing an MCP position directly to an LSP call is a compile error. + +Protocol message variants must be an enum, not a string-matched dispatch. Matching on +`"textDocument/hover"` strings is fragile; a closed enum of known methods makes +unhandled cases visible at compile time and exhaustiveness-checked. + +`InboundMessage` and similar enums that may gain variants in future protocol versions +must be `#[non_exhaustive]` so downstream crates are not broken by additions. + +## Document state + +Caching document content alongside a filesystem signature (`mtime` + size) prevents +stale reads after external edits. On every access, stat the file and compare against +the cached signature; re-read and re-notify the LSP server on mismatch. + +Stat and read must use the same file handle to close the TOCTOU window. Open the file +once with `tokio::fs::File::open`, call `.metadata().await` on the handle, then read +through the same handle. A separate `tokio::fs::metadata` call before `read_to_string` +leaves a window where an atomic `rename` can change the file between the two calls. + +Document version numbers must be strictly monotone per URI. On overflow, reset to 1 +rather than saturating — after `didClose`/`didOpen` the server treats the document as +fresh and the version value is irrelevant to continuity. + +## Diagnostics pipeline + +When a feature draws from multiple independent sources (pull request + push cache), +fetch them concurrently with `tokio::try_join!` or `tokio::join!`. Awaiting them +sequentially serialises work that has no data dependency. + +A cache miss must be distinguishable from an empty result. Returning `Vec::new()` for +both "no errors" and "not yet analysed" is ambiguous to callers. Use an `Option` +or a dedicated status type to express the difference. + +Deduplication keys should prefer structured fields (error code) over free-form message +text. When a code is present, key on `(range, severity, code)`. The doc comment must +describe the actual key — a mismatch between the comment and the implementation causes +silent behavioural bugs. + +## Concurrency and lock scope + +Background notification pumps must hold only the narrowest lock needed (e.g. a +`Mutex`, not a `Mutex`). A pump that shares a lock with request +handlers will stall when a handler holds the lock across a long-running await, causing +the notification channel to fill and the in-flight response to be undeliverable. + +Never hold a `MutexGuard` across an `.await`. Extract needed values before the await, +or restructure so the guard is dropped before any I/O begins. + +## File watcher + +Glob patterns from external registrations must be matched against full absolute paths. +`globset` does not implicitly anchor bare patterns to a directory prefix — a pattern +like `*.rs` only matches a bare filename. Prepend `**/` to patterns that lack a leading +`/` or `**/` before compiling them into a `GlobSet`. + +Noise filtering (build artifacts, VCS directories) must run before events enter any +bounded channel, not after. Filtering post-send wastes channel capacity and can stall +the OS watcher thread when the channel fills during high-churn builds. + +Use `try_send` for the handoff from the OS watcher callback to the async processing +loop. The watcher callback runs on a system thread; blocking it with `send` on a full +channel stalls the OS-level event delivery. + +## LSP client + +Server-to-client requests (messages with both `method` and `id`) require a JSON-RPC +response. Unhandled methods must return error code `-32601` (Method Not Found). Dropping +them silently causes the LSP server to wait indefinitely for an acknowledgement. + +`id`-only messages (no `method`, `result`, or `error`) are protocol errors and must not +be coerced into a successful `null` response. diff --git a/.github/instructions/rust-config.instructions.md b/.github/instructions/rust-config.instructions.md new file mode 100644 index 0000000..d14292b --- /dev/null +++ b/.github/instructions/rust-config.instructions.md @@ -0,0 +1,34 @@ +--- +applyTo: "crates/mcpls-core/src/config/**,deny.toml,Cargo.toml,crates/*/Cargo.toml" +--- + +## Type safety and idiomatic config + +Config option enums must derive `Default` with `#[default]` on the intended default +variant rather than implementing `Default` manually. This keeps the default co-located +with the type definition and is less error-prone. + +Boolean config fields that represent a choice between more than two states should be +`enum`, not `bool`. A `bool` cannot be extended without a breaking change; an enum can +gain variants under `#[non_exhaustive]`. + +New `#[serde(default)]` fields must have a test that deserialises a config with the +field omitted and asserts the default value. Cover both JSON and TOML — parsers can +behave differently on edge cases like empty strings and absent vs. null keys. + +Enum config options must have a round-trip test for every variant. A typo in +`#[serde(rename_all = "...")]` silently changes the on-disk format and breaks existing +user configs without a compile error. + +## MSRV and dependencies + +When introducing a new dependency, check whether the functionality is already available +in `std` for the current MSRV. Prefer `std` over external crates to reduce +compile time and supply-chain surface. + +New crates require a license check in the same PR. Add the license to `deny.toml` +before merging — `cargo deny check licenses` fails fast and blocks the pipeline. + +If a new stable Rust API (e.g. `OnceLock` at 1.70 replacing `once_cell::sync::OnceCell`) +makes an existing dependency redundant, suggest removing the dependency and bumping +`rust-version` instead. Document the bump in CHANGELOG. diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md new file mode 100644 index 0000000..18703ac --- /dev/null +++ b/.github/instructions/tests.instructions.md @@ -0,0 +1,40 @@ +--- +applyTo: "crates/mcpls-core/tests/**,crates/mcpls-core/src/**/tests.rs,crates/mcpls-core/src/**/*_tests.rs" +--- + +## Test design + +Tests for async code must use `#[tokio::test]` and exercise the actual async paths, not +synchronous wrappers that bypass scheduler interactions. Concurrency bugs (lock +contention, channel saturation) are only reachable through genuine async execution. + +Tests that require an external binary must be marked `#[ignore = "Requires +in PATH"]`. CI runners do not have language server binaries; an unmarked test will fail +on every clean run. + +## Correctness invariants to cover + +Any code path that pairs a filesystem operation with a subsequent network notification +(open, close, re-open after external edit) must have a test that exercises the full +sequence using a mock network client and a real temporary file — not just the filesystem +primitive in isolation. + +Glob matching tests must use absolute path strings as input. Passing a bare filename +to a glob set will pass even when the production code fails on absolute paths, because +`globset` anchoring behaviour differs between the two cases. + +Round-trip tests for serialisable types must cover every variant, and must assert the +serialised string value, not just that deserialisation succeeds. A `rename_all` typo +produces a wrong on-disk format that round-trips correctly in isolation. + +## Idiomatic test code + +Prefer `assert_eq!` over `assert!(a == b)` — `assert_eq!` prints both values on +failure, making diagnosis faster. + +Use `tempfile::tempdir()` for tests that need real filesystem paths. Hard-coded `/tmp` +paths cause test interference when multiple test threads run in parallel. + +Prefer `#[tokio::test]` over `tokio::runtime::Runtime::block_on` in test bodies — +`block_on` does not integrate with tokio's test utilities (`time::pause`, +`time::advance`) needed for testing debounce and timeout logic.