Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
98 changes: 98 additions & 0 deletions .github/copilot-instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
# Copilot Instructions for mcpls

mcpls is a bridge between MCP (Model Context Protocol) and LSP (Language Server Protocol).
AI clients speak MCP to mcpls; mcpls spawns and communicates with language servers over LSP.

```
AI Client ←→ [MCP/stdio] ←→ mcpls ←→ [LSP/stdio] ←→ language server
```

Key crates: `mcpls-core/src/bridge/`, `lsp/`, `mcp/`, `config/`.

## Type safety

Prefer newtypes over primitive aliases for domain values. A raw `u32` for a line number
and a raw `u32` for a column are indistinguishable to the compiler; a `Line(u32)` and
`Column(u32)` make transpositions a compile error.

Encode protocol state in the type system where possible. A document that has been opened
and one that has not should ideally be different types, not the same type with a boolean
field. Typestate prevents calling operations that are only valid on open documents.

Avoid stringly-typed dispatch. Matching on `&str` method names to branch protocol
behaviour should be replaced with a typed enum as soon as the set of methods is known
and bounded.

`Option<T>` and `Result<T, E>` must not be unwrapped with `.unwrap()` or `.expect()` in
production code paths. Use `?`, `if let`, `let-else`, or explicit error mapping.

## Idiomatic Rust

Prefer iterator adapters over manual loops. `map`, `filter`, `flat_map`, `collect`, and
`fold` express intent more clearly and are easier to compose than `for` with a mutable
accumulator.

Avoid unnecessary heap allocation. Prefer `&str` over `String` in function signatures
where ownership is not required. Prefer `Cow<str>` when a function sometimes borrows and
sometimes owns.

Use `let-else` (stabilised in Rust 1.65) to reduce nesting when early return on `None`
or `Err` is needed:
```rust
// prefer
let Some(value) = option else { return Err(...) };
// over
let value = match option { Some(v) => v, None => return Err(...) };
```

Derive `Clone`, `Debug`, `PartialEq` on data types unless there is a specific reason not
to. Their absence makes testing harder and the omission is rarely intentional.

## Architecture

A component that bridges two protocols (MCP and LSP) must not let either protocol's
failure mode affect the other. LSP request timeouts must not stall MCP response
delivery, and LSP push notifications must not block MCP request handling.

Shared mutable state must be scoped as narrowly as possible. A `Mutex` that guards a
large composite struct is a concurrency bottleneck; prefer separate locks for
independent sub-components (e.g. a cache and a client are independent).

A `Mutex` guard must never be held across an `.await` point that involves I/O. Doing so
holds the lock for the full duration of the network round-trip, blocking every other
task that needs the lock. Extract the guarded value before the `.await` or restructure
so the lock is released first.

## Async

Prefer `tokio::join!` or `tokio::try_join!` over sequential `.await` chains when
operations are independent. Sequential awaits serialise work that could run concurrently.

Blocking operations (`std::fs`, `std::thread::sleep`, CPU-heavy loops) must not run on
the async executor. Use `tokio::fs`, `tokio::time::sleep`, or `tokio::task::spawn_blocking`
for work that would block a task for more than a few microseconds.

Use `tokio::sync::mpsc` for async-to-async channels. Use `std::sync::mpsc::sync_channel`
only for bridging a synchronous context to async, and always with `try_send` when the
intent is to drop events under backpressure — `send` blocks on a full channel.

Prefer `tokio::select!` with a cancellation token over bare `loop { recv().await }`
for background tasks that must shut down cleanly.

## API currency and MSRV

When reviewing code, check whether newer stable Rust APIs would simplify it. If a
simpler or safer API was stabilised after the current `rust-version` in `Cargo.toml`,
suggest bumping MSRV and using the new API. Document the bump in CHANGELOG under
`### Changed`.

Examples of APIs worth adopting when MSRV permits:
- `let-else` expressions (1.65) — replace verbose `match`/`unwrap_or_else` for early return
- `std::io::read_to_string` on a `File` handle (1.0, but pairing with `.metadata()` on the same
handle closes TOCTOU windows — prefer over separate `stat` + `open`)
- `OnceLock` (1.70) — prefer over `lazy_static` or `once_cell` for static initialisation
- `is_some_and` / `is_none_or` (1.70) — replace `map(|v| ...).unwrap_or(false)` patterns
- `Iterator::array_chunks` (nightly → stable tracking) — watch stabilisation status

When a dependency provides a feature already in `std`, prefer `std`. Fewer dependencies
reduce supply-chain risk and compile time.
18 changes: 18 additions & 0 deletions .github/instructions/changelog.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
---
applyTo: "CHANGELOG.md"
---

## Changelog format

Breaking changes to public APIs must appear under `### Changed` with an explicit
"Breaking change:" prefix. A `### Fixed` entry alone is not sufficient — downstream
users scanning the changelog for breakage will miss it.

Additions of `#[non_exhaustive]` to existing public enums or structs are breaking
changes for crates that match exhaustively. Document the required migration (add a
wildcard arm, update struct initialisation) in the entry.

Every entry must reference the PR or issue number: `(#NNN)`.

New entries go under `## [Unreleased]`. Do not add entries directly under a versioned
section — version assignment happens in the release PR.
33 changes: 33 additions & 0 deletions .github/instructions/ci.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
---
applyTo: ".github/workflows/**,deny.toml,.cargo/**"
---

## Feature flag parity

Feature flag sets in CI jobs must match the pre-commit commands in `CLAUDE.md` exactly.
Divergence causes "passes locally, fails CI" failures where neither side is obviously
wrong.

The rustdoc job must set `RUSTDOCFLAGS="-D warnings"`. Without it, broken intra-doc
links and missing `///` on public items pass silently in CI.

Nightly toolchain is required only for `cargo fmt`. All other jobs must use stable to
avoid failures caused by nightly regressions unrelated to this project.

## Dependency and license policy

New dependencies require a license check in the same PR. `cargo deny check licenses`
fails fast and blocks the pipeline — do not merge before it passes.

`cargo deny check advisories` must be clean. Suppress an advisory only with an explicit
`ignore` entry and a comment explaining why the vulnerable code path is unreachable.

When a newer stable `std` API covers functionality provided by a dependency, suggest
removing the dependency and bumping `rust-version` instead. Fewer dependencies reduce
compile time and supply-chain risk. Document the MSRV bump in CHANGELOG.

## MSRV tracking

`rust-version` in `Cargo.toml` is the enforced minimum. The CI matrix must include a
job that builds and tests on exactly that version to catch accidental use of newer APIs.
Without such a job, MSRV guarantees are not verifiable.
38 changes: 38 additions & 0 deletions .github/instructions/mcp-tools.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
---
applyTo: "crates/mcpls-core/src/mcp/**"
---

## Type safety

Tool parameter structs must use domain types, not raw primitives, for values with
protocol-specific conventions. Position fields (line, column) must be documented as
1-based in both the `///` doc comment and the `#[schemars(description)]` — the MCP↔LSP
coordinate conversion happens in the bridge layer, not here.

Path parameters must accept absolute paths. State this in the schema description; AI
clients use it to form requests correctly.

New tool structs require `#[derive(JsonSchema)]`. Omitting it silently removes the tool
from the schema exposed to clients — they cannot discover or invoke it.

## Error handling

Tool handlers must return structured MCP error responses on failure, never panic.
`unwrap()` and `expect()` in handler code kill the entire server process and drop all
active client sessions. Use `?` with proper error mapping instead.

All error variants must carry enough context for the caller to understand what went
wrong. A bare `"internal error"` string is not actionable. Include the file path,
method name, or other relevant detail.

## Idiomatic dispatch

When adding a new tool, register it in both the capability advertisement (tool list)
and the dispatch match arm. A tool missing from the list is undiscoverable; a tool
missing from dispatch panics at runtime. A compile-time check (e.g. a const assertion
or a test that calls every listed tool name through dispatch) is preferable to relying
on manual review.

Prefer exhaustive match arms over wildcard catch-alls in dispatch. A wildcard silently
swallows unhandled tool names; exhaustive matching makes unregistered names a compile
error.
80 changes: 80 additions & 0 deletions .github/instructions/rust-bridge.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
---
applyTo: "crates/mcpls-core/src/bridge/**,crates/mcpls-core/src/lsp/**,crates/mcpls-core/src/lib.rs"
---

## Type safety

Position types (line, column) must be distinct newtypes, not bare `u32`. The MCP
convention (1-based) and the LSP convention (0-based) must be expressed in the type
system so that passing an MCP position directly to an LSP call is a compile error.

Protocol message variants must be an enum, not a string-matched dispatch. Matching on
`"textDocument/hover"` strings is fragile; a closed enum of known methods makes
unhandled cases visible at compile time and exhaustiveness-checked.

`InboundMessage` and similar enums that may gain variants in future protocol versions
must be `#[non_exhaustive]` so downstream crates are not broken by additions.

## Document state

Caching document content alongside a filesystem signature (`mtime` + size) prevents
stale reads after external edits. On every access, stat the file and compare against
the cached signature; re-read and re-notify the LSP server on mismatch.

Stat and read must use the same file handle to close the TOCTOU window. Open the file
once with `tokio::fs::File::open`, call `.metadata().await` on the handle, then read
through the same handle. A separate `tokio::fs::metadata` call before `read_to_string`
leaves a window where an atomic `rename` can change the file between the two calls.

Document version numbers must be strictly monotone per URI. On overflow, reset to 1
rather than saturating — after `didClose`/`didOpen` the server treats the document as
fresh and the version value is irrelevant to continuity.

## Diagnostics pipeline

When a feature draws from multiple independent sources (pull request + push cache),
fetch them concurrently with `tokio::try_join!` or `tokio::join!`. Awaiting them
sequentially serialises work that has no data dependency.

A cache miss must be distinguishable from an empty result. Returning `Vec::new()` for
both "no errors" and "not yet analysed" is ambiguous to callers. Use an `Option<Vec>`
or a dedicated status type to express the difference.

Deduplication keys should prefer structured fields (error code) over free-form message
text. When a code is present, key on `(range, severity, code)`. The doc comment must
describe the actual key — a mismatch between the comment and the implementation causes
silent behavioural bugs.

## Concurrency and lock scope

Background notification pumps must hold only the narrowest lock needed (e.g. a
`Mutex<Cache>`, not a `Mutex<WholeTranslator>`). A pump that shares a lock with request
handlers will stall when a handler holds the lock across a long-running await, causing
the notification channel to fill and the in-flight response to be undeliverable.

Never hold a `MutexGuard` across an `.await`. Extract needed values before the await,
or restructure so the guard is dropped before any I/O begins.

## File watcher

Glob patterns from external registrations must be matched against full absolute paths.
`globset` does not implicitly anchor bare patterns to a directory prefix — a pattern
like `*.rs` only matches a bare filename. Prepend `**/` to patterns that lack a leading
`/` or `**/` before compiling them into a `GlobSet`.

Noise filtering (build artifacts, VCS directories) must run before events enter any
bounded channel, not after. Filtering post-send wastes channel capacity and can stall
the OS watcher thread when the channel fills during high-churn builds.

Use `try_send` for the handoff from the OS watcher callback to the async processing
loop. The watcher callback runs on a system thread; blocking it with `send` on a full
channel stalls the OS-level event delivery.

## LSP client

Server-to-client requests (messages with both `method` and `id`) require a JSON-RPC
response. Unhandled methods must return error code `-32601` (Method Not Found). Dropping
them silently causes the LSP server to wait indefinitely for an acknowledgement.

`id`-only messages (no `method`, `result`, or `error`) are protocol errors and must not
be coerced into a successful `null` response.
34 changes: 34 additions & 0 deletions .github/instructions/rust-config.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
---
applyTo: "crates/mcpls-core/src/config/**,deny.toml,Cargo.toml,crates/*/Cargo.toml"
---

## Type safety and idiomatic config

Config option enums must derive `Default` with `#[default]` on the intended default
variant rather than implementing `Default` manually. This keeps the default co-located
with the type definition and is less error-prone.

Boolean config fields that represent a choice between more than two states should be
`enum`, not `bool`. A `bool` cannot be extended without a breaking change; an enum can
gain variants under `#[non_exhaustive]`.

New `#[serde(default)]` fields must have a test that deserialises a config with the
field omitted and asserts the default value. Cover both JSON and TOML — parsers can
behave differently on edge cases like empty strings and absent vs. null keys.

Enum config options must have a round-trip test for every variant. A typo in
`#[serde(rename_all = "...")]` silently changes the on-disk format and breaks existing
user configs without a compile error.

## MSRV and dependencies

When introducing a new dependency, check whether the functionality is already available
in `std` for the current MSRV. Prefer `std` over external crates to reduce
compile time and supply-chain surface.

New crates require a license check in the same PR. Add the license to `deny.toml`
before merging — `cargo deny check licenses` fails fast and blocks the pipeline.

If a new stable Rust API (e.g. `OnceLock` at 1.70 replacing `once_cell::sync::OnceCell`)
makes an existing dependency redundant, suggest removing the dependency and bumping
`rust-version` instead. Document the bump in CHANGELOG.
40 changes: 40 additions & 0 deletions .github/instructions/tests.instructions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
---
applyTo: "crates/mcpls-core/tests/**,crates/mcpls-core/src/**/tests.rs,crates/mcpls-core/src/**/*_tests.rs"
---

## Test design

Tests for async code must use `#[tokio::test]` and exercise the actual async paths, not
synchronous wrappers that bypass scheduler interactions. Concurrency bugs (lock
contention, channel saturation) are only reachable through genuine async execution.

Tests that require an external binary must be marked `#[ignore = "Requires <binary>
in PATH"]`. CI runners do not have language server binaries; an unmarked test will fail
on every clean run.

## Correctness invariants to cover

Any code path that pairs a filesystem operation with a subsequent network notification
(open, close, re-open after external edit) must have a test that exercises the full
sequence using a mock network client and a real temporary file — not just the filesystem
primitive in isolation.

Glob matching tests must use absolute path strings as input. Passing a bare filename
to a glob set will pass even when the production code fails on absolute paths, because
`globset` anchoring behaviour differs between the two cases.

Round-trip tests for serialisable types must cover every variant, and must assert the
serialised string value, not just that deserialisation succeeds. A `rename_all` typo
produces a wrong on-disk format that round-trips correctly in isolation.

## Idiomatic test code

Prefer `assert_eq!` over `assert!(a == b)` — `assert_eq!` prints both values on
failure, making diagnosis faster.

Use `tempfile::tempdir()` for tests that need real filesystem paths. Hard-coded `/tmp`
paths cause test interference when multiple test threads run in parallel.

Prefer `#[tokio::test]` over `tokio::runtime::Runtime::block_on` in test bodies —
`block_on` does not integrate with tokio's test utilities (`time::pause`,
`time::advance`) needed for testing debounce and timeout logic.
Loading