feat(cli+api+mcp): agent walk-aware CLI + council doc fixes + 5 AI-user friction fixes#82

Merged
mbachaud merged 2 commits into master from agent-cli-fixes
May 12, 2026
@mbachaud
Owner

Summary

Two commits, two slices of work, both surfaced by AI-user testing of the CLI+MCP surface:

  1. 9dbd960: Council-prioritized doc fixes + agent walk-aware CLI. Closes the gap the first council flagged on the v1 CLI (PR #71, "feat(cli): v1 cold-start helix-cli (query/ingest/status/diag/config)"): agents can now drive genome lookups via subprocess CLI calls instead of relying on MCP-injected context. Also includes the README / ROSETTA / docs fixes the council prioritized.
  2. 0a68896: Five friction points reported by a second AI-user pass on origin/master@93deaf2. Each is a one-shot fix with a focused unit test.

Branch is two commits ahead of origin/master. Full mock suite green (1924 passed, 0 failures, ~8min).

Commit 1 — Agent walk-aware CLI + council doc fixes

CLI agent surface (4 new subcommands; identical JSON shape to the matching MCP tools + HTTP endpoints):

  • helix packet <text> --task-type edit|ops|... — agent-safe bundle with verified / stale_risk / refresh_targets
  • helix gene get <id> / helix gene preview <id> --chars N — single-document inspection
  • helix neighbors <text> --k N — SEMA graph walk
  • helix refresh-targets <text> — reread plan only
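
A minimal sketch of how an agent might drive `helix packet` through a subprocess call, as described above. It is illustrative, not the project's own client code: `packet_argv` and `run_packet` are hypothetical helper names, and the sketch assumes that `--json` prints a single JSON document on stdout. The fallback to `python -m helix_context.cli` is the module invocation this PR adds.

```python
import json
import shutil
import subprocess
import sys
from typing import Any, Callable, List


def packet_argv(text: str, task_type: str = "edit",
                which: Callable[[str], Any] = shutil.which) -> List[str]:
    """Build the argv for `helix packet`, preferring the console script."""
    # Fall back to module invocation when `helix` is not on PATH.
    if which("helix"):
        base = ["helix"]
    else:
        base = [sys.executable, "-m", "helix_context.cli"]
    return base + ["packet", text, "--task-type", task_type, "--json"]


def run_packet(text: str, task_type: str = "edit",
               runner: Callable[..., Any] = subprocess.run) -> Any:
    """Run the CLI once and parse its stdout as JSON (assumed output shape)."""
    proc = runner(
        packet_argv(text, task_type),
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit
        timeout=60,           # arbitrary cap so a hung call cannot stall the agent
    )
    return json.loads(proc.stdout)
```

The `which` and `runner` parameters exist only so the helpers can be exercised without a real install; an agent would call `run_packet("foo")` directly.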

api.HelixSession: promote gene_get, packet, refresh_targets, neighbors from v1.1-deferred to v1 — they back the four new CLI subcommands without requiring an HTTP server or MCP host. Pure in-process wrappers over Genome.get_gene, build_context_packet, and the existing SEMA codec. Read-only.

Docs: README adds an "Agent CLI surface (no server required)" section and sources the 28.7× / 5.4× headline against the reproducer at benchmarks/bench_rag_vs_sike_tokens.py. ROSETTA adds a "Response & routing types (STAYS — no biology twin)" section covering ContextWindow / ContextPacket / KnowBlock / MissBlock / RefreshTarget / ContextHealth / ContextItem / QueryResult / IngestResult / StatsResult. The HGT → cross_store_import row is annotated as a forward-pointer; the OPEN/EUCHROMATIN/HETEROCHROMATIN → OPEN/WARM/COLD row notes the rename is deferred to R3.

Commit 2 — Five AI-user friction fixes

| # | Friction reported | Fix |
|---|---|---|
| 1 | helix query and helix diag corpus silently ignored HELIX_CONFIG / HELIX_GENOME_PATH: they read/created ./genome.db while helix status looked at the configured genome | api.py: open_session() now routes through load_config() (single-source with helix status) |
| 2 | helix status --json falsely reported a healthy-but-slow server as unreachable (1.5s timeout, cold-start /health takes 5-10s) | helix_status.py: default 1.5s → 10s, override via HELIX_STATUS_TIMEOUT_S, malformed-value fallback warns instead of crashing |
| 3 | MCP helix_context failed schema validation: /context returns the Continue HTTP list shape, MCP host expected Dict[str, Any] | New _unwrap_context_list() helper unwraps the single-entry list; applied to helix_context and helix_document_query. Continue IDE compatibility untouched. |
| 4 | helix ingest README.md raised TranscriptionError: Pack failed: Ribosome is disabled on cold-start install | config.py: load_config() auto-flips ingestion.backend → "cpu" when ribosome is disabled. Routes to the spaCy/heuristic CpuTagger that's already in tree. Explicit cpu/hybrid configs untouched. LLM-free pillar intact. Logs WARNING. |
| 5 | helix.exe broken (pip console script pointed at deleted editable path; Scripts dir off PATH) | Added helix_context/cli/__main__.py so python -m helix_context.cli works as a fallback. Recovery recipe + pip install --force-reinstall --no-deps documented in docs/clients/cli.md. |
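
Fix 2 can be sketched as an environment-variable parse that warns and falls back rather than raising. This is a hedged illustration, not the code in helix_status.py: the function name `status_timeout_s`, the logger name, and the rejection of non-positive or non-finite values are assumptions; only the 10s default and the HELIX_STATUS_TIMEOUT_S variable come from the PR.

```python
import logging
import math
import os
from typing import Mapping, Optional

log = logging.getLogger("helix_status_sketch")  # hypothetical logger name

DEFAULT_TIMEOUT_S = 10.0  # new default stated in the PR (was 1.5s)


def status_timeout_s(environ: Optional[Mapping[str, str]] = None) -> float:
    """Return the /health probe timeout, honoring HELIX_STATUS_TIMEOUT_S."""
    env = os.environ if environ is None else environ
    raw = env.get("HELIX_STATUS_TIMEOUT_S")
    if raw is None or not raw.strip():
        return DEFAULT_TIMEOUT_S
    try:
        value = float(raw)
    except ValueError:
        value = math.nan  # treat unparseable input like any other bad value
    # Assumption: non-finite and non-positive values also count as malformed.
    if not math.isfinite(value) or value <= 0:
        log.warning("Ignoring malformed HELIX_STATUS_TIMEOUT_S=%r; using %.1fs",
                    raw, DEFAULT_TIMEOUT_S)
        return DEFAULT_TIMEOUT_S
    return value
```

Passing a mapping instead of reading `os.environ` directly keeps the parse easy to unit-test.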

Test plan

  • 38 new unit tests (10 cli_packet/refresh/neighbors/gene + 10 api_walk + 3 config + 6 mcp_server + 9 dispatcher/integration). All pass.
  • Full mock suite: python -m pytest -m "not live and not requires_rocm and not requires_real_cuda and not requires_mps" → 1924 passed, 0 failures, 10 skipped, 26 deselected, 2 xfailed in 470s.
  • Manual smoke against a real genome: helix packet "foo" --json, helix gene get <id> --json, helix refresh-targets "edit X" — not done here; reviewer should sanity-check on their workstation before merge.
  • Manual MCP smoke: confirm Claude Code / Cursor see helix_context returning a dict (not a list) after the unwrap fix.
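
For the MCP smoke check, the unwrap behaviour can be approximated as below. This is a sketch under stated assumptions, not the helper in mcp_server.py: `unwrap_context_list` is a stand-in name, dict payloads (assumed to include error envelopes) pass through unchanged, and the `note` / `raw` keys used to wrap unexpected shapes are hypothetical.

```python
from typing import Any, Dict


def unwrap_context_list(payload: Any) -> Dict[str, Any]:
    """Coerce a /context response into the dict shape an MCP host expects."""
    # Dicts (assumed to include error envelopes) are already valid: pass through.
    if isinstance(payload, dict):
        return payload
    # The Continue context-provider shape: a list holding a single dict entry.
    if isinstance(payload, list) and len(payload) == 1 and isinstance(payload[0], dict):
        return payload[0]
    # Anything else is wrapped so the declared return type is still satisfied.
    return {
        "note": "unexpected /context response shape; returning it wrapped",
        "raw": payload,
    }
```

A reviewer could compare the real tool output against this shape: a single-entry list should come back as the inner dict, never as a list.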

Related issues

This PR does not close any open issue outright. Two issues partially overlap and should be reconciled at merge time:

Note on the MCP slimdown plan (#78 / #79 / #80 / #81)

Building the four CLI peers in this PR was originally framed under the council's recommendation to "complete the CLI as an agent tool." The earlier round of analysis on PR #78 (MCP slimdown spec) found that:

  • The 13-tool demotion-to-CLI premise rests on CLI peers existing. 9 of 13 still don't exist, so demoting those tools would degrade agent UX on day one.
  • The latency-defense argument the spec uses for keeping 6 MCP tools (in-stdio beats subprocess.run) applies equally to most of the demoted tools agents walk during a multi-step loop.
  • The "1038 → 450 adapter LOC" headline is engineering convenience, not user value.
  • Agent UX is the product; shrinking the easiest agent surface to save adapter code works against the product thesis.

Recommendation kept from that analysis: don't merge #78 as written. What stays valuable from the slimdown work:

🤖 Generated with Claude Code

mbachaud and others added 2 commits May 12, 2026 11:59
Council review flagged the v1 CLI as operator-only and the docs as
having a stale endpoint table + unsourced headline benchmark. This
commit closes both gaps:

- **CLI agent surface**: add `helix packet`, `helix gene get|preview`,
  `helix neighbors`, `helix refresh-targets`. Agents can now drive
  full retrieval+walk loops via subprocess calls without an MCP host
  or running HTTP server. JSON shapes match the matching HTTP
  endpoints + MCP tools so callers swap surfaces without changing
  logic.

- **api.py**: promote `gene_get` / `packet` / `refresh_targets` /
  `neighbors` from v1.1-deferred to v1 on `HelixSession`. Pure
  in-process wrappers over `Genome.get_gene`,
  `build_context_packet`, and the existing SEMA codec. Read-only.

- **README**: new "Agent CLI surface (no server required)" section;
  headline `28.7× / 5.4×` claim now cites
  `benchmarks/bench_rag_vs_sike_tokens.py` (reproducer) and
  `docs/benchmarks/BENCHMARKS.md` (methodology).

- **ROSETTA**: new "Response & routing types (STAYS)" section covering
  `ContextWindow` / `ContextPacket` / `KnowBlock` / `MissBlock` /
  `RefreshTarget` / `ContextHealth` / `ContextItem` / `QueryResult` /
  `IngestResult` / `StatsResult`. Annotate the `HGT →
  cross_store_import` row as forward-pointer (no code under either
  name today). Annotate `OPEN/EUCHROMATIN/HETEROCHROMATIN →
  OPEN/WARM/COLD` as deferred-to-R3 since `ChromatinState` in
  `schemas.py` still emits bio names.

- **cli.md**: title-page "agent vs. operator" decision table + per-
  subcommand reference for the four new commands.

Tests: 28 new (10 api_walk + 18 cli_*); full mock suite stays green
(1914 passed, 0 failures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reported against origin/master@93deaf2. Each fix has a focused unit
test; full mock suite stays green (1924 passed, 0 failures).

1. **api.py: open_session() now honors HELIX_CONFIG / HELIX_GENOME_PATH.**
   Pre-fix every cold-start CLI subcommand (query, diag corpus, packet,
   gene, neighbors, refresh-targets) instantiated HelixConfig() directly
   and silently created/read ./genome.db regardless of the operator's
   config — so `helix status` looked at the configured genome but
   `helix query` looked at an empty one. Route through load_config() so
   HELIX_CONFIG and HELIX_GENOME_PATH are honored the same way
   `helix status` already honors them.

2. **helix_status.py: bump /health probe timeout 1.5s → 10s, override via
   HELIX_STATUS_TIMEOUT_S.** Cold-start /health can take 5-10s under
   model warmup + manager init + WAL replay; the old 1.5s timeout
   silently reported a healthy-but-slow server as unreachable in
   `helix status --json`. New default + env-var override + malformed-
   value fallback warning.

3. **mcp_server.py: unwrap the Continue list shape in helix_context /
   helix_document_query tools.** /context returns the Continue HTTP
   context-provider list ([{name, description, content, ...}]) to stay
   drop-in compatible with Continue IDE. MCP hosts validate tool returns
   against the declared Dict[str, Any] schema and rejected the list.
   New _unwrap_context_list helper flattens the single-entry list,
   passes _http error envelopes through, and defensively wraps unexpected
   list shapes with a diagnostic note.

4. **config.py: auto-fallback ingestion.backend → "cpu" when
   ribosome.enabled = false.** The two settings contradicted each other
   — ingest with the ribosome disabled raised TranscriptionError: Pack
   failed: Ribosome is disabled on the first chunk. The CpuTagger
   (spaCy + heuristic) was always available but never reached because
   ingestion.backend defaulted to "ollama". load_config() now flips
   ingestion to the CPU path and logs a WARNING. Honors explicit cpu /
   hybrid settings without override. Keeps the LLM-free pillar
   intact and finally makes cold-start `helix ingest` actually work
   on a fresh install.

5. **cli/__main__.py + cli.md: `python -m helix_context.cli` works as a
   console-script fallback.** When the pip-installed `helix.exe` points
   at a deleted editable-install path or the Scripts dir is off PATH
   (common AI-user environment issue), agents now have a module-direct
   invocation that bypasses the console script entirely. Documented
   alongside the `pip install --force-reinstall --no-deps` recipe.
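
The reconciliation in item 4 can be sketched as follows. The dataclasses are simplified stand-ins for the real config objects, and `reconcile_ingestion_backend` is a hypothetical name; only the rule itself comes from the commit message (flip to "cpu" when the ribosome is disabled, leave explicit cpu / hybrid settings alone, log a WARNING).

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("helix_config_sketch")  # hypothetical logger name


@dataclass
class RibosomeCfg:  # stand-in for the real ribosome section
    enabled: bool


@dataclass
class IngestionCfg:  # stand-in for the real ingestion section
    backend: str


@dataclass
class Cfg:
    ribosome: RibosomeCfg
    ingestion: IngestionCfg


def reconcile_ingestion_backend(cfg: Cfg) -> Cfg:
    """Flip ingestion to the CPU path when the ribosome is disabled."""
    # Explicit "cpu" / "hybrid" settings are honored without override.
    if not cfg.ribosome.enabled and cfg.ingestion.backend not in ("cpu", "hybrid"):
        log.warning(
            "ribosome disabled; switching ingestion.backend %r -> 'cpu'",
            cfg.ingestion.backend,
        )
        cfg.ingestion.backend = "cpu"
    return cfg
```

Doing this once at config-load time, rather than at the first ingested chunk, is what turns the contradictory settings into a warning instead of a TranscriptionError.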

Tests: 10 new unit tests (1 api_walk + 3 config + 6 mcp_server).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>