v0.12 quality pass: host-sample, config editor, OSS docs, code-quality polish #2
Merged
- SECURITY.md: replace the unmodified GitHub template with a real policy: supported versions (0.11.x only), GitHub private vulnerability reporting as the channel, scope (in: this repo + npm package; out: ai-consensus-core, MCP hosts, upstream LLMs), and 5/10-business-day acknowledge/triage SLAs.
- CONTRIBUTING.md: quickstart (npm install + npm run check), command-reference table, PR norms (focused scope, tests + coverage ratchet, CHANGELOG [Unreleased] entry, no version bumps), and bug-report checklist with security carve-out.
- progress.md: remove from the index and add to .gitignore. It was an internal dev log that had been shipped in the repo with no value to consumers. History is preserved in git; the file stays on disk for local note-keeping but is no longer tracked.
Two features developed together for the v0.x line, plus the doc/default polish that makes them honest in Claude Code today.
host-sample participants
A participant can be answered by the calling MCP host instead of a configured provider via MCP `sampling/createMessage`: no extra API key, no extra provider entry. The host's LLM gets the persona system prompt and answers; the engine continues.
Status: experimental, **Claude Desktop only** today. Claude Code, Cursor, and Windsurf do not yet advertise the `sampling` capability (tracking: anthropics/claude-code#1785). The pre-flight gate returns a clear isError naming the participant rather than hanging; the message names Claude Desktop and the upstream issue.
- src/config.ts: ParticipantConfigSchema becomes a discriminated union (provider | host-sample); LoadedConfig.hostSampleParticipants.
- src/adapter.ts: createSamplingCaller issues server.createMessage(); createRoutedCaller dispatches per participant.
- src/server.ts: ensureSamplingSupported pre-flight gate against server.getClientCapabilities().sampling.
- src/presets/resolve-panel.ts: host-sample propagates through the preset panel resolver alongside provider-backed entries.
- tests: 5 unit tests for the sampling caller (positive path, modelHint forwarding, non-text rejection, host-error wrapping, missing-entry guard) + config-loader and server tests for the gate + end-to-end propagation in resolve-panel.
interactive config editor
New `ai-consensus-mcp config` subcommand (alias: `configure`): an @inquirer/prompts TUI for editing the entire JSON config (providers, participants, judge, defaults). Schema-validated against the same Zod schema the server uses on startup. Atomic writes via fs.rename(2); "Discard & exit" / Ctrl-C aborts cleanly without writing.
- src/cli/config.ts (new): the TUI flow.
- src/cli/main.ts: subcommand dispatch + help text.
- vitest.config.ts: excludes src/cli/config.ts from coverage, because unit-mocking every prompt would be more brittle than the value it adds.
- package.json: adds @inquirer/prompts dependency.
defaults & docs
- Default Anthropic apiKeyEnv namespaced as CONSENSUS_ANTHROPIC_API_KEY in the example config and the editor's starter config; this avoids colliding with Claude Code's ANTHROPIC_API_KEY auto-detection on Claude Max subscriptions.
- consensus.config.example.json no longer includes a host-sample seat by default; users opt in by editing the config.
- README lead bullet and the dedicated host-sample section now state up front that the feature is experimental and Claude Desktop only, linking the upstream tracking issue.
- docs/install.md env-var example renamed to match.
- CHANGELOG [Unreleased]: existing Added blocks for host-sample and the editor, plus a new Changed block describing the experimental repositioning and the env-var rename (not breaking: it only ships new defaults; existing configs keep whatever apiKeyEnv they set).
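The participant union, the pre-flight gate, and the per-participant dispatch described in this commit can be sketched roughly as follows. This is an illustration only: the real schema is a Zod schema in src/config.ts, and every type, field name other than `kind` and `modelHint`, and function name below (`checkSamplingSupported`, `routeCaller`) is a hypothetical stand-in rather than the project's API.

```typescript
// Hypothetical shapes standing in for ParticipantConfigSchema (provider | host-sample).
type ProviderParticipant = { kind: "provider"; id: string; provider: string; model: string };
type HostSampleParticipant = { kind: "host-sample"; id: string; modelHint?: string };
type Participant = ProviderParticipant | HostSampleParticipant;

// Assumed subset of what an MCP host advertises; only `sampling` matters here.
type ClientCapabilities = { sampling?: Record<string, unknown> };

type GateResult = { ok: true } | { ok: false; isError: true; message: string };

// Pre-flight gate: fail fast, naming the affected participants, instead of
// issuing a sampling request that the host would never answer.
function checkSamplingSupported(
  participants: Participant[],
  caps: ClientCapabilities | undefined,
): GateResult {
  const hostSampled = participants.filter((p) => p.kind === "host-sample");
  if (hostSampled.length === 0 || caps?.sampling !== undefined) {
    return { ok: true };
  }
  const names = hostSampled.map((p) => p.id).join(", ");
  return {
    ok: false,
    isError: true,
    message: `Participant(s) ${names} use host-sample, but the connected MCP host does not advertise the sampling capability.`,
  };
}

type Caller = (prompt: string) => Promise<string>;

// Per-participant dispatch: the discriminant decides which caller answers.
function routeCaller(p: Participant, viaProvider: Caller, viaSampling: Caller): Caller {
  switch (p.kind) {
    case "provider":
      return viaProvider;
    case "host-sample":
      return viaSampling;
  }
}
```

Because `kind` is the discriminant, the `switch` stays exhaustive: adding a third participant kind to the union would surface as a compile-time gap in `routeCaller` rather than as a runtime surprise.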
CI policy is `npm install` (not `npm ci`) without a committed lockfile — ai-consensus-core must already be on npm to resolve, per the CI comment in .github/workflows/ci.yml. Adding the lockfile to .gitignore stops it from showing up as untracked when local installs generate it (notably after the @inquirer/prompts dep was added in the previous commit).
callViaSampling and createSamplingCaller's routing guard previously
threw plain `Error` instances; the only way to distinguish failure
modes was to pattern-match the message string. The consensus code
review of 2701dd8 called this out as inconsistent with the rich
`Error` returned by ensureSamplingSupported.
This commit introduces a `SamplingError` class with a discriminated
`code` field for the four runtime failure modes:
- missing-entry: routing bug (server marked the participant
  host-sample but no meta exists)
- host-error: server.createMessage() rejected; original
  error attached as ES2022 `cause`
- unsupported-content: host returned a non-text block; the type
  (image/audio/...) is in `contentType`
- empty-response: host returned text, but the string is empty
`participantId` is always set so logs and triage stay specific.
Message strings are unchanged, so any external tooling that already
greps them still works.
src/adapter.ts grows ~30 LOC (the class + the throw-site swaps).
Tests in adapter-sampling.test.ts now assert on `instanceof
SamplingError`, `code`, `participantId`, `contentType`, and `cause`
instead of regex-matching messages — six tests total (added a
sixth case for `empty-response`, which the previous suite hadn't
covered).
CHANGELOG entry under [Unreleased] with the recommended switch/case
shape for callers.
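A minimal sketch of what such a typed error and a caller-side switch/case might look like. The four codes and the field names `code`, `participantId`, `contentType`, and `cause` come from the commit message; the constructor signature, the `SamplingErrorDetails` type, and `describeFailure` are assumptions made for illustration, and `cause` is assigned by hand here rather than through the ES2022 `Error` options argument so the sketch compiles under older lib targets.

```typescript
type SamplingErrorCode =
  | "missing-entry"
  | "host-error"
  | "unsupported-content"
  | "empty-response";

// Hypothetical details bag; the real constructor signature may differ.
interface SamplingErrorDetails {
  participantId: string;
  contentType?: string;
  cause?: unknown;
}

class SamplingError extends Error {
  readonly code: SamplingErrorCode;
  readonly participantId: string;
  readonly contentType?: string;
  readonly cause?: unknown;

  constructor(code: SamplingErrorCode, message: string, details: SamplingErrorDetails) {
    super(message);
    // Keeps `instanceof` working when compiled to pre-ES2015 targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "SamplingError";
    this.code = code;
    this.participantId = details.participantId;
    this.contentType = details.contentType;
    this.cause = details.cause;
  }
}

// Caller-side handling: branch on `code` instead of regex-matching messages.
function describeFailure(err: unknown): string {
  if (!(err instanceof SamplingError)) return "unknown failure";
  switch (err.code) {
    case "missing-entry":
      return `routing bug: no host-sample metadata for ${err.participantId}`;
    case "host-error":
      return `host rejected the sampling request for ${err.participantId}`;
    case "unsupported-content":
      return `host returned ${err.contentType ?? "non-text"} content for ${err.participantId}`;
    case "empty-response":
      return `host returned empty text for ${err.participantId}`;
  }
}
```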
Two new test files driven by consensus code-review feedback. Pushes
global coverage from 66.78% / 67.79% (statements / lines) to 78.45%
/ 80.05% — both line coverage and the Phase 2 ratchet target met.
src/__tests__/adapter-http.test.ts (9 tests, +250 LOC)
Mocks globalThis.fetch and constructs SSE-formatted ReadableStreams
to exercise createOpenAICompatibleCaller end to end without touching
a real provider. Covers:
- streamed-delta assembly + usage parsing
- computed totalTokens fallback when provider omits the field
- cross-chunk JSON line reassembly
- data: lines whose payload fails JSON.parse (silently skipped)
- non-OK HTTP responses: status, statusText, truncated body
- empty-stream rejection
- missing provider mapping
- mapped provider id not loaded
- bearer auth + extra headers + body shape forwarded correctly
adapter.ts coverage: 21.51% → 90.58% statements.
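The stream behaviours listed above (delta assembly, cross-chunk line reassembly, silently skipped payloads, computed totalTokens) can be illustrated with a small parser. This is a sketch, not the code in src/adapter.ts: `assembleSse` and its types are hypothetical names, it takes already-decoded string chunks instead of a ReadableStream, and it assumes the usual OpenAI-compatible chunk fields (`choices[0].delta.content`, `usage.prompt_tokens`, `usage.completion_tokens`, `usage.total_tokens`).

```typescript
interface Usage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

interface Assembled {
  text: string;
  usage?: Usage;
}

// Parse one SSE line; returns undefined for anything that is not a JSON data line.
function parseDataLine(line: string): any {
  const trimmed = line.trim();
  if (trimmed.indexOf("data:") !== 0) return undefined;
  const payload = trimmed.slice("data:".length).trim();
  if (payload === "" || payload === "[DONE]") return undefined;
  try {
    return JSON.parse(payload);
  } catch {
    return undefined; // payload fails JSON.parse: skip silently
  }
}

function assembleSse(chunks: string[]): Assembled {
  let buffer = "";
  let text = "";
  let usage: Usage | undefined;

  const consume = (line: string): void => {
    const event = parseDataLine(line);
    if (event === undefined || event === null) return;
    const delta = event.choices?.[0]?.delta?.content;
    if (typeof delta === "string") text += delta;
    const u = event.usage;
    if (u && typeof u.prompt_tokens === "number" && typeof u.completion_tokens === "number") {
      usage = {
        promptTokens: u.prompt_tokens,
        completionTokens: u.completion_tokens,
        // Fallback when the provider omits total_tokens.
        totalTokens:
          typeof u.total_tokens === "number" ? u.total_tokens : u.prompt_tokens + u.completion_tokens,
      };
    }
  };

  for (const chunk of chunks) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the unfinished tail for the next chunk
    for (const line of lines) consume(line);
  }
  if (buffer !== "") consume(buffer);

  // Assumption: an empty assembled text is treated as an empty-stream failure.
  if (text === "") throw new Error("empty stream");
  return { text, usage };
}
```

Keeping the unfinished tail in `buffer` is what lets a JSON line split across two network chunks be parsed once the rest of the line arrives.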
src/__tests__/progress.test.ts (7 tests, +180 LOC)
Duck-typed mock engine (on/off/emit) drives wireEngineProgress's
handlers directly so the heavyweight ConsensusEngine isn't needed.
Covers:
- total = maxRounds + (judgeEnabled ? 1 : 0)
- progress-counter increments only on roundComplete + synthesisComplete
- participant lifecycle rendering (success and error tags)
- confidenceUpdate, disagreementDetected, earlyStop, synthesisStart,
finalResult, engine error
- detach() removes every registered listener
- sendNotification rejections swallowed (best-effort delivery)
progress.ts coverage: 0% → ~100%.
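A duck-typed mock of this kind might look like the sketch below. The event names `roundComplete`, `synthesisStart`, and `synthesisComplete` and the `total` formula are taken from the list above; `MockEngine`, `wireProgress`, the `notify` callback shape, and the message strings are hypothetical and are not the signature of the real wireEngineProgress.

```typescript
type Listener = (payload?: unknown) => void;

interface EngineLike {
  on(event: string, fn: Listener): void;
  off(event: string, fn: Listener): void;
}

// Minimal on/off/emit stand-in so tests never construct the real engine.
class MockEngine implements EngineLike {
  private listeners: Record<string, Listener[]> = {};

  on(event: string, fn: Listener): void {
    const list = this.listeners[event] || [];
    list.push(fn);
    this.listeners[event] = list;
  }

  off(event: string, fn: Listener): void {
    this.listeners[event] = (this.listeners[event] || []).filter((l) => l !== fn);
  }

  emit(event: string, payload?: unknown): void {
    for (const fn of (this.listeners[event] || []).slice()) fn(payload);
  }

  listenerCount(): number {
    let count = 0;
    for (const key of Object.keys(this.listeners)) count += (this.listeners[key] || []).length;
    return count;
  }
}

function wireProgress(
  engine: EngineLike,
  opts: { maxRounds: number; judgeEnabled: boolean },
  notify: (progress: number, total: number, message: string) => Promise<void>,
): () => void {
  const total = opts.maxRounds + (opts.judgeEnabled ? 1 : 0);
  let progress = 0;

  // Best-effort delivery: a rejected notification must not break the run.
  const send = (message: string): void => {
    notify(progress, total, message).catch(() => undefined);
  };

  const handlers: Array<[string, Listener]> = [
    ["roundComplete", () => { progress += 1; send("round complete"); }],
    ["synthesisStart", () => { send("synthesis started"); }],
    ["synthesisComplete", () => { progress += 1; send("synthesis complete"); }],
  ];
  for (const [event, fn] of handlers) engine.on(event, fn);

  // detach() removes every listener registered above.
  return () => {
    for (const [event, fn] of handlers) engine.off(event, fn);
  };
}
```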
vitest.config.ts
Thresholds ratcheted from 55/47/57/55 to 78/69/83/80 (Phase 2
ratchet, anchored at the new floor with a small buffer below the
current numbers). src/presets/**/*.ts keeps its stricter floor
(90/75/95/90) unchanged. Comment block updated to record the bump.
CHANGELOG entry under [Unreleased] with the before/after table.
Test count: 137 → 154 (16 new + the empty-response sampling test
added in cd44209).
The latest consensus code review flagged this as the strongest concrete
critique remaining on cd44209: a plain Error subclass loses every
non-enumerable own property under JSON.stringify, so logging pipelines
that serialise errors would see only the `message` string — `code`,
`participantId`, `contentType`, `cause`, and `stack` would silently
disappear. This is a real top-tier polish gap for a typed error.
The new toJSON() returns a structured snapshot with all the custom
fields plus a JSON-safe cause:
- Error causes summarised as { name, message }
- Plain object/string/number/boolean causes round-tripped through
JSON to drop functions/symbols/circular refs cleanly
- Anything that can't be JSON-stringified (e.g. BigInt, circular
refs) falls back to "[unserializable cause]" via a try/catch
- contentType and cause are omitted from the output when unset
Tests added in adapter-sampling.test.ts cover all four shapes:
Error cause, plain-object cause, unserializable cause (BigInt), and
the no-optional-fields case. Test count for this file: 6 → 10.
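The cause-handling rules above can be sketched as a standalone helper plus a snapshot builder. This is an approximation under stated assumptions: `toJsonSafeCause` and `snapshot` are hypothetical names, the exact set of keys the real toJSON() emits is not given in the commit message, and treating a top-level value that JSON.stringify maps to `undefined` (such as a bare function) as unserializable is a choice made here.

```typescript
const UNSERIALIZABLE = "[unserializable cause]";

function toJsonSafeCause(cause: unknown): unknown {
  // Error causes are summarised rather than serialised wholesale.
  if (cause instanceof Error) return { name: cause.name, message: cause.message };
  try {
    // Round-trip through JSON: nested functions and symbols are dropped,
    // circular references and BigInt values throw and land in the catch.
    const encoded = JSON.stringify(cause);
    if (encoded === undefined) return UNSERIALIZABLE;
    return JSON.parse(encoded);
  } catch {
    return UNSERIALIZABLE;
  }
}

interface SamplingErrorSnapshot {
  name: string;
  code: string;
  message: string;
  participantId: string;
  contentType?: string;
  cause?: unknown;
}

// Builds the structured snapshot a toJSON() method could return.
function snapshot(fields: {
  code: string;
  message: string;
  participantId: string;
  contentType?: string;
  cause?: unknown;
}): SamplingErrorSnapshot {
  const out: SamplingErrorSnapshot = {
    name: "SamplingError",
    code: fields.code,
    message: fields.message,
    participantId: fields.participantId,
  };
  // Optional fields are omitted entirely when unset.
  if (fields.contentType !== undefined) out.contentType = fields.contentType;
  if (fields.cause !== undefined) out.cause = toJsonSafeCause(fields.cause);
  return out;
}
```

A class would typically expose this as `toJSON() { return snapshot(this); }`, which JSON.stringify picks up automatically.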
Also fixes a tiny markdown table-header padding mis-format in
CHANGELOG.md that snuck through commit f1ec22e — extending the
existing typed-error block with the toJSON note also re-aligned
the threshold table.
Closes the documentation gap the prior consensus code review (the
"good but not top-tier" verdict on docs) called out as the missing
piece between this project and a top-tier OSS surface.
CODE_OF_CONDUCT.md
Adopts Contributor Covenant 2.1 by reference rather than vendoring
the canonical text. Many top-tier libs (drizzle, hono, others) ship
the link-only form because it stays current as the Covenant evolves
and avoids forking the standards text. Reporting channel is GitHub's
private security-advisory flow with a [conduct] title prefix; SLAs
match SECURITY.md (5/10 business days).
.github/ISSUE_TEMPLATE/
- bug_report.md: what happened / expected / repro / environment
(server version, Node, MCP host, OS) / additional context, with
a security carve-out pointing at SECURITY.md.
- feature_request.md: problem / proposed solution / alternatives /
additional context, with a pointer to CONTRIBUTING.md for larger
proposals.
- config.yml: blank_issues_enabled=false; contact_links surface the
Security and Code-of-Conduct flows so users land on the right
channel before opening a public issue.
.github/PULL_REQUEST_TEMPLATE.md
Summary / changes / checklist that mirrors the PR norms already in
CONTRIBUTING.md (tests, `npm run check` green, CHANGELOG entry, no
version bumps).
CONTRIBUTING.md
Now points at the Code of Conduct in the lead paragraph.
CHANGELOG.md
New "### Added" block under [Unreleased] describing the additions.
Summary
Bundles the v0.12-line work that's been queued on this branch: a major feature (host-sample participants + interactive config editor), OSS-baseline documentation additions (SECURITY, CONTRIBUTING, CODE_OF_CONDUCT, issue/PR templates), and a code-quality pass driven by three rounds of self-review using the project's own consensus tool.
Posted as a single PR rather than splitting per-concern because the work was developed sequentially toward the same v0.12 release and the narrative is clearer end-to-end. Future work will land per-PR; this is a one-time correction for commits that had accumulated directly on main.
Commits (7)
Each commit's body is the authoritative explanation. Read commit-by-commit rather than the squashed diff.
- 00db958 docs: SECURITY policy and CONTRIBUTING guide; untrack progress.md
- 40275c8 feat: host-sample participants and interactive config editor
- d592c8c build: ignore package-lock.json
- 07a75f8 refactor: typed SamplingError for host-sample failures
- 9433a01 test: cover adapter HTTP caller and progress.ts; ratchet coverage thresholds
- bae5656 refactor: add SamplingError.toJSON() for structured logging
- 49e609d docs: Code of Conduct and .github issue/PR templates
Highlights
Feature (commit 2)
- `kind: "host-sample"` participants: the calling MCP host answers as a roundtable seat via `sampling/createMessage`. Status: experimental, Claude Desktop only today (Claude Code tracking issue: [Feature Request] Support for MCP Sampling to leverage Claude Max subscriptions and reduce API costs, anthropics/claude-code#1785). Pre-flight capability gate fails fast with a clear `isError` instead of hanging.
- `ai-consensus-mcp config` subcommand: `@inquirer/prompts` TUI for editing the JSON config; schema-validated with the same Zod schema the server uses on startup; atomic writes via `fs.rename(2)`.
- `apiKeyEnv` namespaced as `CONSENSUS_ANTHROPIC_API_KEY` to avoid colliding with Claude Code's `ANTHROPIC_API_KEY` auto-detection on Claude Max subscriptions.
Code quality (commits 4–6)
- `SamplingError` class with discriminated `code` field (`missing-entry` / `host-error` / `unsupported-content` / `empty-response`), ES2022 `cause`, and `toJSON()` for structured logging pipelines.
- `adapter-http.test.ts` (HTTP caller via `fetch` mock + SSE streams), `progress.test.ts` (engine-event progress bridge with a duck-typed mock engine).
Docs / governance (commits 1, 7)
- `SECURITY.md` (replaces the unmodified GitHub template).
- `CONTRIBUTING.md` with quickstart, command reference, PR norms.
- `CODE_OF_CONDUCT.md` adopting Contributor Covenant 2.1 by reference.
- `.github/ISSUE_TEMPLATE/{bug_report,feature_request,config}.md/.yml` and `PULL_REQUEST_TEMPLATE.md`.
Driven by self-review
Three consecutive consensus runs (against the project's own MCP tool — meta) shaped this PR:
- `SECURITY.md`, missing `CONTRIBUTING.md`, and other governance gaps. Addressed by commits 1 and 7.
- `callViaSampling` as the remaining gaps to top-tier. Addressed by commits 4 and 5.
- `SamplingError.toJSON()` missing for structured logging. Addressed by commit 6.
Test plan
- `npm run check` green locally: typecheck + lint + format + tests + coverage thresholds
- `check` + stdio smoke test on Node 20 and 22
Notes for the reviewer
The `host-sample` feature is shipped as experimental; users on Claude Code, Cursor, or Windsurf get a clear error directing them to a configured provider until upstream sampling support lands.