v0.12 quality pass: host-sample, config editor, OSS docs, code-quality polish #2

Merged: marceloceccon merged 7 commits into main from feat/v0.12-quality-pass on May 4, 2026

Summary

Bundles the v0.12-line work that's been queued on this branch: a major feature (host-sample participants + interactive config editor), OSS-baseline documentation additions (SECURITY, CONTRIBUTING, CODE_OF_CONDUCT, issue/PR templates), and a code-quality pass driven by three rounds of self-review using the project's own consensus tool.

Posted as a single PR rather than splitting per-concern because the work was developed sequentially toward the same v0.12 release and the narrative is clearer end-to-end. Future work will land per-PR; this is a one-time correction for commits that had accumulated directly on main.

Commits (7)

Each commit's body is the authoritative explanation. Read commit-by-commit rather than the squashed diff.

  1. 00db958 docs: SECURITY policy and CONTRIBUTING guide; untrack progress.md
  2. 40275c8 feat: host-sample participants and interactive config editor
  3. d592c8c build: ignore package-lock.json
  4. 07a75f8 refactor: typed SamplingError for host-sample failures
  5. 9433a01 test: cover adapter HTTP caller and progress.ts; ratchet coverage thresholds
  6. bae5656 refactor: add SamplingError.toJSON() for structured logging
  7. 49e609d docs: Code of Conduct and .github issue/PR templates

Highlights

Feature (commit 2)

  • kind: "host-sample" participants: the calling MCP host answers as a roundtable seat via sampling/createMessage. Status: experimental, Claude Desktop only today (Claude Code tracking issue: anthropics/claude-code#1785, "[Feature Request] Support for MCP Sampling to leverage Claude Max subscriptions and reduce API costs"). Pre-flight capability gate fails fast with a clear isError instead of hanging.
  • ai-consensus-mcp config subcommand — @inquirer/prompts TUI for editing the JSON config; schema-validated with the same Zod schema the server uses on startup; atomic writes via fs.rename(2).
  • Default Anthropic apiKeyEnv namespaced as CONSENSUS_ANTHROPIC_API_KEY to avoid colliding with Claude Code's ANTHROPIC_API_KEY auto-detection on Claude Max subscriptions.

Code quality (commits 4–6)

  • Typed SamplingError class with discriminated code field (missing-entry / host-error / unsupported-content / empty-response), ES2022 cause, and toJSON() for structured logging pipelines.
  • Coverage: 66.78% → 78.45% statements; 67.79% → 80.05% lines. Thresholds ratcheted from 55/47/57/55 to 78/69/83/80.
  • 137 → 159 tests across 13 files. New: adapter-http.test.ts (HTTP caller via fetch mock + SSE streams), progress.test.ts (engine-event progress bridge with a duck-typed mock engine).

Docs / governance (commits 1, 7)

  • Real SECURITY.md (replaces the unmodified GitHub template).
  • CONTRIBUTING.md with quickstart, command reference, PR norms.
  • CODE_OF_CONDUCT.md adopting Contributor Covenant 2.1 by reference.
  • .github/ISSUE_TEMPLATE/{bug_report,feature_request,config}.md/.yml and PULL_REQUEST_TEMPLATE.md.

Driven by self-review

Three consecutive consensus runs (against the project's own MCP tool — meta) shaped this PR:

  1. OSS-quality baseline review — flagged the unmodified-template SECURITY.md, missing CONTRIBUTING.md, and other governance gaps. Addressed by commits 1 and 7.
  2. Top-tier code review of commit 2 — flagged 66% coverage and untyped errors in callViaSampling as the remaining gaps to top-tier. Addressed by commits 4 and 5.
  3. Top-tier follow-up review of commits 4–5 — flagged SamplingError.toJSON() missing for structured logging. Addressed by commit 6.

Test plan

  • npm run check green locally: typecheck + lint + format + tests + coverage thresholds
  • 159 tests across 13 files passing in ~1s
  • Coverage above the new Phase 2 thresholds (78 / 69 / 83 / 80)
  • CI re-runs check + stdio smoke test on Node 20 and 22

Notes for the reviewer

  • Merge strategy: "Create a merge commit" or "Rebase and merge" — not squash — to preserve the 7-commit narrative.
  • The commit boundaries are intentional. Each commit is independently reviewable and was authored to be load-bearing in history.
  • The host-sample feature is shipped as experimental; users on Claude Code, Cursor, or Windsurf get a clear error directing them to a configured provider until upstream sampling support lands.

- SECURITY.md: replace the unmodified GitHub template with a real policy:
  supported versions (0.11.x only), GitHub private vulnerability reporting
  as the channel, scope (in: this repo + npm package; out: ai-consensus-core,
  MCP hosts, upstream LLMs), and 5/10-business-day acknowledge/triage SLAs.

- CONTRIBUTING.md: quickstart (npm install + npm run check), command-
  reference table, PR norms (focused scope, tests + coverage ratchet,
  CHANGELOG [Unreleased] entry, no version bumps), and bug-report
  checklist with security carve-out.

- progress.md: remove from the index and add to .gitignore. It was an
  internal dev log that had been shipped in the repo with no value to
  consumers. History is preserved in git; the file stays on disk for
  local note-keeping but is no longer tracked.
Two features developed together for the v0.x line, plus the doc/
default polish that makes them honest in Claude Code today.

host-sample participants
  A participant can be answered by the calling MCP host instead of a
  configured provider via MCP `sampling/createMessage` — no extra API
  key, no extra provider entry. The host's LLM gets the persona system
  prompt and answers; the engine continues.

  Status: experimental, **Claude Desktop only** today. Claude Code,
  Cursor, and Windsurf do not yet advertise the `sampling` capability
  (tracking: anthropics/claude-code#1785). The pre-flight gate returns
  a clear isError naming the participant rather than hanging — the
  message names Claude Desktop and the upstream issue.

  - src/config.ts: ParticipantConfigSchema becomes a discriminated
    union (provider | host-sample); LoadedConfig.hostSampleParticipants.
  - src/adapter.ts: createSamplingCaller issues server.createMessage();
    createRoutedCaller dispatches per participant.
  - src/server.ts: ensureSamplingSupported pre-flight gate against
    server.getClientCapabilities().sampling.
  - src/presets/resolve-panel.ts: host-sample propagates through the
    preset panel resolver alongside provider-backed entries.
  - tests: 5 unit tests for the sampling caller (positive path,
    modelHint forwarding, non-text rejection, host-error wrapping,
    missing-entry guard) + config-loader and server tests for the gate
    + end-to-end propagation in resolve-panel.
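
A minimal sketch of the pre-flight gate described above, assuming simplified shapes. Only the names `ensureSamplingSupported`, `host-sample`, `modelHint`, and the `sampling` capability come from this commit; the type definitions, the function signature, and the result shape are hypothetical stand-ins for the real code in src/config.ts and src/server.ts.

```typescript
// Hypothetical participant shape: a discriminated union on `kind`.
type ParticipantConfig =
  | { kind: "provider"; id: string; provider: string; model: string }
  | { kind: "host-sample"; id: string; modelHint?: string };

// Hypothetical subset of what the host advertises during initialization.
interface ClientCapabilities {
  sampling?: Record<string, unknown>;
}

type GateResult =
  | { ok: true }
  | { ok: false; isError: true; message: string };

// Fail fast when a host-sample seat is configured but the host never
// advertised the `sampling` capability, instead of hanging on a request
// the host will not answer.
function ensureSamplingSupported(
  participants: ParticipantConfig[],
  capabilities: ClientCapabilities | undefined,
): GateResult {
  const hostSeat = participants.find((p) => p.kind === "host-sample");
  if (hostSeat === undefined || capabilities?.sampling !== undefined) {
    return { ok: true };
  }
  return {
    ok: false,
    isError: true,
    message:
      `Participant "${hostSeat.id}" is host-sample, but this MCP host does not ` +
      `advertise the sampling capability (tracking: anthropics/claude-code#1785). ` +
      `Use Claude Desktop or switch the participant to a configured provider.`,
  };
}
```

The gate only inspects configuration and advertised capabilities, so it can run before any round starts.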

interactive config editor
  New `ai-consensus-mcp config` subcommand (alias: `configure`) — an
  @inquirer/prompts TUI for editing the entire JSON config: providers,
  participants, judge, defaults. Schema-validated against the same Zod
  schema the server uses on startup. Atomic writes via fs.rename(2);
  "Discard & exit" / Ctrl-C aborts cleanly without writing.

  - src/cli/config.ts (new): the TUI flow.
  - src/cli/main.ts: subcommand dispatch + help text.
  - vitest.config.ts: excludes src/cli/config.ts from coverage;
    unit-mocking every prompt would add more brittleness than value.
  - package.json: adds @inquirer/prompts dependency.
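
The atomic-write step can be sketched as follows. This is an illustration under assumptions, not the code in src/cli/config.ts: the helper name `writeConfigAtomically` and the temp-file naming are hypothetical. The idea is to write the full JSON to a sibling temp file and then rename it over the target, so a reader never observes a half-written config (rename is atomic when both paths are on the same filesystem).

```typescript
import { randomUUID } from "node:crypto";
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: serialize, write next to the target, then rename.
// Keeping the temp file in the same directory keeps both paths on one
// filesystem, which rename needs in order to replace the target atomically.
function writeConfigAtomically(targetPath: string, config: unknown): void {
  const tempPath = join(dirname(targetPath), `.config-${randomUUID()}.tmp`);
  writeFileSync(tempPath, JSON.stringify(config, null, 2) + "\n", "utf8");
  renameSync(tempPath, targetPath);
}

// Usage: write into a throwaway directory and read the result back.
const demoDir = mkdtempSync(join(tmpdir(), "consensus-config-"));
const demoTarget = join(demoDir, "consensus.config.json");
writeConfigAtomically(demoTarget, { providers: {}, participants: [] });
console.log(readFileSync(demoTarget, "utf8"));
```

Aborting before the rename leaves the existing config untouched, which matches the "Discard & exit" behaviour described above.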

defaults & docs
  - Default Anthropic apiKeyEnv namespaced as CONSENSUS_ANTHROPIC_API_KEY
    in the example config and the editor's starter config — avoids
    colliding with Claude Code's ANTHROPIC_API_KEY auto-detection on
    Claude Max subscriptions.
  - consensus.config.example.json no longer includes a host-sample seat
    by default; users opt in by editing the config.
  - README lead bullet and the dedicated host-sample section now state
    up front that the feature is experimental and Claude Desktop only,
    linking the upstream tracking issue.
  - docs/install.md env-var example renamed to match.
  - CHANGELOG [Unreleased]: existing Added blocks for host-sample and
    the editor, plus a new Changed block describing the experimental
    repositioning and the env-var rename (not breaking — only ships
    new defaults; existing configs keep whatever apiKeyEnv they set).
CI policy is `npm install` (not `npm ci`) without a committed lockfile —
ai-consensus-core must already be on npm to resolve, per the CI comment
in .github/workflows/ci.yml. Adding the lockfile to .gitignore stops it
from showing up as untracked when local installs generate it (notably
after the @inquirer/prompts dep was added in the previous commit).
callViaSampling and createSamplingCaller's routing guard previously
threw plain `Error` instances; the only way to distinguish failure
modes was to pattern-match the message string. The consensus code
review of 2701dd8 called this out as inconsistent with the rich
`Error` returned by ensureSamplingSupported.

This commit introduces a `SamplingError` class with a discriminated
`code` field for the four runtime failure modes:

  - missing-entry        routing bug (server marked the participant
                         host-sample but no meta exists)
  - host-error           server.createMessage() rejected; original
                         error attached as ES2022 `cause`
  - unsupported-content  host returned a non-text block; the type
                         (image/audio/...) is in `contentType`
  - empty-response       host returned text, but the string is empty

`participantId` is always set so logs and triage stay specific.
Message strings are unchanged, so any external tooling that already
greps them still works.

src/adapter.ts grows ~30 LOC (the class + the throw-site swaps).
Tests in adapter-sampling.test.ts now assert on `instanceof
SamplingError`, `code`, `participantId`, `contentType`, and `cause`
instead of regex-matching messages — six tests total (added a
sixth case for `empty-response`, which the previous suite hadn't
covered).

CHANGELOG entry under [Unreleased] with the recommended switch/case
shape for callers.
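
One possible shape for the typed error and the caller-side switch, as a sketch only. The four `code` values, `participantId`, `contentType`, and the ES2022 `cause` come from this commit message; the constructor signature, the `describeFailure` helper, and the message strings are hypothetical and do not reproduce the CHANGELOG text.

```typescript
type SamplingErrorCode =
  | "missing-entry"
  | "host-error"
  | "unsupported-content"
  | "empty-response";

interface SamplingErrorOptions {
  participantId: string;
  contentType?: string;
  cause?: unknown;
}

class SamplingError extends Error {
  readonly code: SamplingErrorCode;
  readonly participantId: string;
  readonly contentType?: string;

  // Hypothetical constructor signature, chosen for the example.
  constructor(code: SamplingErrorCode, message: string, options: SamplingErrorOptions) {
    // Only pass `cause` through when one was provided (ES2022 error cause).
    super(message, options.cause === undefined ? undefined : { cause: options.cause });
    this.name = "SamplingError";
    this.code = code;
    this.participantId = options.participantId;
    this.contentType = options.contentType;
  }
}

// Callers branch on `code` instead of pattern-matching message strings.
function describeFailure(err: unknown): string {
  if (!(err instanceof SamplingError)) return "unexpected failure";
  switch (err.code) {
    case "missing-entry":
      return `routing bug: no host-sample meta for ${err.participantId}`;
    case "host-error":
      return `host rejected the sampling request for ${err.participantId}`;
    case "unsupported-content":
      return `host returned ${err.contentType ?? "non-text"} content for ${err.participantId}`;
    case "empty-response":
      return `host returned empty text for ${err.participantId}`;
    default: {
      // Compile-time exhaustiveness check: adding a new code breaks this line.
      const unreachable: never = err.code;
      return unreachable;
    }
  }
}
```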
…esholds

Two new test files driven by consensus code-review feedback. Pushes
global coverage from 66.78% / 67.79% (statements / lines) to 78.45%
/ 80.05%, which meets both the line-coverage goal and the Phase 2
ratchet target.

src/__tests__/adapter-http.test.ts (9 tests, +250 LOC)
  Mocks globalThis.fetch and constructs SSE-formatted ReadableStreams
  to exercise createOpenAICompatibleCaller end to end without touching
  a real provider. Covers:
    - streamed-delta assembly + usage parsing
    - computed totalTokens fallback when provider omits the field
    - cross-chunk JSON line reassembly
    - data: lines whose payload fails JSON.parse (silently skipped)
    - non-OK HTTP responses: status, statusText, truncated body
    - empty-stream rejection
    - missing provider mapping
    - mapped provider id not loaded
    - bearer auth + extra headers + body shape forwarded correctly
  adapter.ts coverage: 21.51% → 90.58% statements.
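
The cross-chunk reassembly and silent-skip behaviour these tests cover can be illustrated with a small line buffer. This is a sketch under assumptions, not the parser inside createOpenAICompatibleCaller: the helper name `createSseJsonParser` is hypothetical, and treating `[DONE]` as an end-of-stream sentinel is assumed from the OpenAI-compatible convention rather than stated in this commit.

```typescript
// Hypothetical helper: buffers text until a full line is available, then
// forwards the JSON payload of each `data:` line to `onEvent`.
function createSseJsonParser(onEvent: (payload: unknown) => void): {
  push(chunk: string): void;
} {
  let buffer = "";
  return {
    push(chunk: string): void {
      buffer += chunk;
      const lines = buffer.split("\n");
      // The last element is an incomplete line (or ""), so keep it for later.
      buffer = lines.pop() ?? "";
      for (const rawLine of lines) {
        const line = rawLine.trim();
        if (!line.startsWith("data:")) continue;
        const data = line.slice("data:".length).trim();
        // Assumed sentinel for OpenAI-compatible streams.
        if (data === "" || data === "[DONE]") continue;
        let payload: unknown;
        try {
          payload = JSON.parse(data);
        } catch {
          continue; // Payload is not valid JSON: skip it silently.
        }
        onEvent(payload);
      }
    },
  };
}

// Usage: one JSON payload split across two chunks, plus a malformed line.
const received: unknown[] = [];
const parser = createSseJsonParser((payload) => received.push(payload));
parser.push('data: {"delta":');
parser.push('"hi"}\n\ndata: not-json\ndata: [DONE]\n');
console.log(received);
```

Parsing is kept outside the `onEvent` call so that an exception thrown by the handler is not mistaken for a malformed payload.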

src/__tests__/progress.test.ts (7 tests, +180 LOC)
  Duck-typed mock engine (on/off/emit) drives wireEngineProgress's
  handlers directly so the heavyweight ConsensusEngine isn't needed.
  Covers:
    - total = maxRounds + (judgeEnabled ? 1 : 0)
    - progress-counter increments only on roundComplete + synthesisComplete
    - participant lifecycle rendering (success and error tags)
    - confidenceUpdate, disagreementDetected, earlyStop, synthesisStart,
      finalResult, engine error
    - detach() removes every registered listener
    - sendNotification rejections swallowed (best-effort delivery)
  progress.ts coverage: 0% → ~100%.
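
The duck-typed approach can be sketched like this. Only the on/off/emit surface, the `total` formula, the roundComplete and synthesisComplete events, `detach()`, and the swallowed sendNotification rejections come from the description above; the `wireEngineProgress` signature and the `MockEngine` class shown here are hypothetical simplifications.

```typescript
type Listener = (...args: unknown[]) => void;

// The only surface the bridge needs, so a small mock can stand in for
// the real ConsensusEngine.
interface EngineLike {
  on(event: string, listener: Listener): void;
  off(event: string, listener: Listener): void;
}

class MockEngine implements EngineLike {
  private readonly listeners = new Map<string, Set<Listener>>();

  on(event: string, listener: Listener): void {
    const set = this.listeners.get(event) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(event, set);
  }

  off(event: string, listener: Listener): void {
    this.listeners.get(event)?.delete(listener);
  }

  emit(event: string, ...args: unknown[]): void {
    const set = this.listeners.get(event);
    if (set === undefined) return;
    Array.from(set).forEach((listener) => listener(...args));
  }

  listenerCount(): number {
    let count = 0;
    this.listeners.forEach((set) => {
      count += set.size;
    });
    return count;
  }
}

// Hypothetical signature: counts progress on round and synthesis completion.
function wireEngineProgress(
  engine: EngineLike,
  options: { maxRounds: number; judgeEnabled: boolean },
  sendNotification: (progress: number, total: number) => Promise<void>,
): { total: number; detach(): void } {
  const total = options.maxRounds + (options.judgeEnabled ? 1 : 0);
  let progress = 0;
  const advance: Listener = () => {
    progress += 1;
    // Best-effort delivery: a rejected notification must not break the run.
    void sendNotification(progress, total).catch(() => {});
  };
  const counted = ["roundComplete", "synthesisComplete"];
  counted.forEach((event) => engine.on(event, advance));
  return {
    total,
    detach: () => counted.forEach((event) => engine.off(event, advance)),
  };
}
```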

vitest.config.ts
  Thresholds ratcheted from 55/47/57/55 to 78/69/83/80 (Phase 2
  ratchet, anchored at the new floor with a small buffer below the
  current numbers). src/presets/**/*.ts keeps its stricter floor
  (90/75/95/90) unchanged. Comment block updated to record the bump.

CHANGELOG entry under [Unreleased] with the before/after table.

Test count: 137 → 154 (16 new + the empty-response sampling test
added in cd44209).
The latest consensus code review flagged this as the strongest concrete
critique remaining on cd44209: JSON.stringify skips non-enumerable own
properties, so logging pipelines that serialise a plain Error subclass
can silently lose `message`, `cause`, and `stack`, and keep custom fields
such as `code`, `participantId`, and `contentType` only if they are
enumerable own properties. For a typed error, that is a real polish gap.

The new toJSON() returns a structured snapshot with all the custom
fields plus a JSON-safe cause:
  - Error causes summarised as { name, message }
  - Plain object/string/number/boolean causes round-tripped through
    JSON to drop functions/symbols/circular refs cleanly
  - Anything that can't be JSON-stringified (e.g. BigInt, circular
    refs) falls back to "[unserializable cause]" via a try/catch
  - contentType and cause are omitted from the output when unset

Tests added in adapter-sampling.test.ts cover all four shapes:
Error cause, plain-object cause, unserializable cause (BigInt), and
the no-optional-fields case. Test count for this file: 6 → 10.
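
The cause-handling rules listed above could be implemented along these lines. This is a sketch, assuming hypothetical helper names (`toJsonSafeCause`, `toSnapshot`) and a hypothetical snapshot shape; only the four behaviours and the "[unserializable cause]" fallback string are taken from this commit message.

```typescript
const UNSERIALIZABLE = "[unserializable cause]";

// Hypothetical helper: reduce any `cause` to something JSON can carry.
function toJsonSafeCause(cause: unknown): unknown {
  if (cause instanceof Error) {
    return { name: cause.name, message: cause.message };
  }
  try {
    // Round-trip through JSON to drop functions and symbols.
    const text = JSON.stringify(cause);
    // JSON.stringify returns undefined for values such as bare functions.
    return text === undefined ? UNSERIALIZABLE : JSON.parse(text);
  } catch {
    // BigInt values and circular references make JSON.stringify throw.
    return UNSERIALIZABLE;
  }
}

// Hypothetical snapshot builder: optional fields are omitted when unset.
function toSnapshot(fields: {
  code: string;
  participantId: string;
  message: string;
  contentType?: string;
  cause?: unknown;
}): Record<string, unknown> {
  const snapshot: Record<string, unknown> = {
    name: "SamplingError",
    code: fields.code,
    participantId: fields.participantId,
    message: fields.message,
  };
  if (fields.contentType !== undefined) snapshot.contentType = fields.contentType;
  if (fields.cause !== undefined) snapshot.cause = toJsonSafeCause(fields.cause);
  return snapshot;
}
```

A class-level `toJSON()` could return the result of a builder like this, since JSON.stringify calls `toJSON()` on any value that defines it.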

Also fixes a tiny markdown table-header padding mis-format in
CHANGELOG.md that snuck through commit f1ec22e — extending the
existing typed-error block with the toJSON note also re-aligned
the threshold table.
Closes the documentation gap the prior consensus code review (the
"good but not top-tier" verdict on docs) called out as the missing
piece between this project and a top-tier OSS surface.

CODE_OF_CONDUCT.md
  Adopts Contributor Covenant 2.1 by reference rather than vendoring
  the canonical text. Many top-tier libs (drizzle, hono, others) ship
  the link-only form because it stays current as the Covenant evolves
  and avoids forking the standards text. Reporting channel is GitHub's
  private security-advisory flow with a [conduct] title prefix; SLAs
  match SECURITY.md (5/10 business days).

.github/ISSUE_TEMPLATE/
  - bug_report.md: what happened / expected / repro / environment
    (server version, Node, MCP host, OS) / additional context, with
    a security carve-out pointing at SECURITY.md.
  - feature_request.md: problem / proposed solution / alternatives /
    additional context, with a pointer to CONTRIBUTING.md for larger
    proposals.
  - config.yml: blank_issues_enabled=false; contact_links surface the
    Security and Code-of-Conduct flows so users land on the right
    channel before opening a public issue.

.github/PULL_REQUEST_TEMPLATE.md
  Summary / changes / checklist that mirrors the PR norms already in
  CONTRIBUTING.md (tests, `npm run check` green, CHANGELOG entry, no
  version bumps).

CONTRIBUTING.md
  Now points at the Code of Conduct in the lead paragraph.

CHANGELOG.md
  New "### Added" block under [Unreleased] describing the additions.
@marceloceccon marceloceccon merged commit 1382f75 into main May 4, 2026
6 checks passed
@marceloceccon marceloceccon deleted the feat/v0.12-quality-pass branch May 4, 2026 15:49