feat(tools): add oc_journal_compact deterministic core #1442

Merged
shaun0927 merged 6 commits into develop from feat/1434-journal-compact on May 28, 2026

@shaun0927
Owner

Summary

  • New oc_journal_compact MCP tool with three strategies:
    • recent_k (deterministic, default) — concat summaries, truncate to fit token budget.
    • checkpoint_only (deterministic) — milestone-only + last checkpoint.
    • sampling (host-mediated) — sampling/createMessage with strict fallback to unsupported_by_host when client capability absent.
  • Output: { summary, facts, open_assertions, last_checkpoint?, tokens_estimated, strategy_used }.
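
For orientation, the output shape listed above could be typed roughly as follows. This is only a sketch: the field names come from the summary, but every field type (a string array for `facts`, the `OpenAssertion` shape, a string checkpoint reference) is an assumption and not the actual declaration in src/tools/oc-journal-compact.ts.

```typescript
// Hypothetical typing of the oc_journal_compact output.
// Field names are from the PR summary; all field types are assumptions.
type Strategy = "recent_k" | "checkpoint_only" | "sampling";

interface OpenAssertion {
  name: string;     // assumed: identifier of the failed oc_assert
  message?: string; // assumed: optional failure detail
}

interface JournalCompactOutput {
  summary: string;
  facts: string[];
  open_assertions: OpenAssertion[];
  last_checkpoint?: string; // optional, as the "?" in the summary indicates
  tokens_estimated: number;
  strategy_used: Strategy;
}

// Example value that satisfies the sketched shape.
const exampleOutput: JournalCompactOutput = {
  summary: "navigated to /login; submitted form",
  facts: ["login form has 2 fields"],
  open_assertions: [{ name: "dashboard_visible" }],
  tokens_estimated: 9,
  strategy_used: "recent_k",
};
```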

SSOT (#1359) alignment

  • No server-side LLM ever — only the host's sampling capability.
  • Deterministic by default. Host capability is strictly optional.
  • Read-only annotation.

Files

  • src/tools/oc-journal-compact.ts (new)
  • src/tools/index.ts (register)
  • src/types/tool-annotations.ts (READ_ONLY entry)
  • docs/mcp/tool-annotations.md (markdown mirror)
  • tests/tools/oc-journal-compact.test.ts — 8 tests covering all strategies, truncation, failed-assert surfacing, unknown-strategy rejection, session filter.

Out of scope (Part 2 of #1434)

  • Integration test that round-trips a full trajectory through compaction and resumes execution against the original oc_assert contract.
  • docs/skills/trajectory-compaction.md with recommended cadence (every 20–50 steps).

Test plan

  • tests/tools/oc-journal-compact.test.ts — 8/8 pass.
  • tests/unit/tool-annotations.test.ts — 9/9 pass (markdown mirror sync).

Part 1 of #1434.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

New `oc_journal_compact` MCP tool that compresses a sliding window of
journal entries into a model-friendly summary. Three strategies:

  - `recent_k` (default): deterministic concatenation of summaries
    truncated to fit a token budget (~4 chars/token heuristic, same
    style as src/mcp/output-observability.ts).
  - `checkpoint_only`: milestone-flagged entries only, plus the
    last successful oc_checkpoint.
  - `sampling`: host-mediated summarisation via
    `sampling/createMessage`. Returns `unsupported_by_host` when the
    client lacks the `sampling` capability. OpenChrome never falls
    back to a server-side LLM.
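
The `recent_k` strategy above can be sketched as follows. The function names and signatures are illustrative, not the tool's real API. Dropping entries from the head (oldest first) follows the "head-drop truncation" wording in the review later in this thread; joining with newlines and dropping whole summaries rather than cutting mid-entry are assumptions.

```typescript
// Sketch of recent_k-style compaction. Names and signatures are illustrative.
const CHARS_PER_TOKEN = 4; // the ~4 chars/token heuristic from the description

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Concatenate summaries oldest-to-newest, dropping the oldest ones
// until the joined text fits the token budget.
function compactRecentK(summaries: string[], maxTokens: number): string {
  const kept = [...summaries];
  while (kept.length > 0 && estimateTokens(kept.join("\n")) > maxTokens) {
    kept.shift(); // drop from the head so the latest events survive
  }
  return kept.join("\n");
}
```

Because the result depends only on the input strings and the budget, repeated calls over the same journal window return the same text.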

Output always includes facts, open failed assertions, and an
estimated token count.
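
The `checkpoint_only` selection described above might look like the sketch below. The entry shape is hypothetical: the `milestone` flag, the `ok` success indicator, and the `tool` field are assumptions introduced for illustration; only the `oc_checkpoint` tool name comes from this PR.

```typescript
// Hypothetical journal entry; every field here is an assumption.
interface CheckpointJournalEntry {
  tool: string;       // e.g. "oc_checkpoint" or "navigate"
  summary: string;
  milestone: boolean; // assumed flag marking milestone entries
  ok: boolean;        // assumed success indicator
}

// Keep milestone-flagged entries, and separately report the most recent
// successful oc_checkpoint entry (if any).
function selectCheckpointOnly(entries: CheckpointJournalEntry[]): {
  milestones: CheckpointJournalEntry[];
  lastCheckpoint?: CheckpointJournalEntry;
} {
  const milestones = entries.filter((e) => e.milestone);
  let lastCheckpoint: CheckpointJournalEntry | undefined;
  for (const e of entries) {
    if (e.tool === "oc_checkpoint" && e.ok) lastCheckpoint = e; // later entries win
  }
  return { milestones, lastCheckpoint };
}
```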

Part 1 of #1434. The integration test that round-trips a real
trajectory through compaction + resumed execution against the
original oc_assert contract, plus the recommended-cadence docs,
land in Part 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
shaun0927 and others added 3 commits May 28, 2026 22:51
CI fixes for #1442:
- src/tools/index.ts: register oc_journal_compact in TOOL_CAPABILITY_MAP
  as 'core' so the capability-filter "all tools have capability tags"
  test passes.
- src/tools/__tests__/__snapshots__/tools-list.v1.11.snap.json: add
  oc_journal_compact to the baseline tools surface so the
  "tools/list matches v1.11.0 baseline snapshot" test passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The comment claimed only the latest assertion per name is kept, but the
implementation (and its test) surface all failed oc_assert entries in
order. Fix the comment to match behavior.
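
A minimal sketch of the behaviour the corrected comment describes, assuming a hypothetical entry shape (the `name` and `passed` fields are illustrative; only the `oc_assert` tool name comes from the PR). The real `pickOpenAssertions` may differ in detail.

```typescript
// Hypothetical entry shape; only the tool name oc_assert comes from the PR.
interface AssertJournalEntry {
  tool: string;
  name?: string;    // assumed: assertion name
  passed?: boolean; // assumed: assertion outcome
}

// Return every failed oc_assert entry in journal order.
// No de-duplication by assertion name is performed.
function pickOpenAssertionsSketch(entries: AssertJournalEntry[]): AssertJournalEntry[] {
  return entries.filter((e) => e.tool === "oc_assert" && e.passed === false);
}
```

With this shape, two failures of the same named assertion both appear in the result, which is the behaviour the stale comment contradicted.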

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sion_id

- MCP `sampling/createMessage` may return `content` as a single block or an
  array of blocks. Parse both forms and take the first text block instead of
  only the scalar shape, so the summary is not silently dropped.
- Document that the `recent_steps` window is taken across all sessions before
  the `session_id` filter is applied, so a busy journal may yield fewer than
  `recent_steps` entries for one session.
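
The parsing change in the first bullet can be sketched as below. `ContentBlock` is a deliberately minimal stand-in for the MCP content types, and `firstTextBlock` is a hypothetical helper name rather than the function used in the tool.

```typescript
// Minimal stand-in for an MCP content block; the real types are richer.
interface ContentBlock {
  type: string;
  text?: unknown;
}

// Accept either a single block or an array of blocks and return the text of
// the first text block, or undefined when there is none.
function firstTextBlock(
  content: ContentBlock | ContentBlock[] | undefined,
): string | undefined {
  const blocks = Array.isArray(content) ? content : content ? [content] : [];
  for (const block of blocks) {
    if (block.type === "text" && typeof block.text === "string") {
      return block.text;
    }
  }
  return undefined;
}
```

Normalising to an array first keeps one code path for both response forms, so an array response is no longer treated as "no summary".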

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@shaun0927
Owner Author

Review & merge-readiness analysis (#1442)

Intent: New oc_journal_compact MCP tool that compresses a window of task-journal entries into a model-friendly summary. Part 1 of #1434.

Alignment with the SSOT (#1359)

Clean alignment:

  • P2 (harness, not agent): the sampling strategy is strictly host-mediated via sampling/createMessage and returns {status:"unsupported_by_host"} when the client does not advertise the capability — it never falls back to a server-side LLM.
  • P4 (facts before decisions): recent_k (default) and checkpoint_only are fully deterministic (char-based token heuristic, head-drop truncation that keeps the latest events).
  • Unknown-client safety: the deterministic default works with zero optional capabilities; READ_ONLY annotation, tool-annotation markdown mirror, and tools-list snapshot are all updated consistently.
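
The P2 guarantee above amounts to a capability gate in front of the sampling call. A sketch under stated assumptions: `SamplingOutcome` and `compactViaSampling` are hypothetical names, and `requestSummary` stands in for the actual `sampling/createMessage` round trip to the host.

```typescript
// Hypothetical capability and result shapes for illustration.
interface ClientCapabilitiesSketch {
  sampling?: object; // present when the client advertises sampling
}

type SamplingOutcome =
  | { status: "ok"; summary: string }
  | { status: "unsupported_by_host" };

// requestSummary stands in for a sampling/createMessage round trip to the host.
async function compactViaSampling(
  caps: ClientCapabilitiesSketch,
  requestSummary: () => Promise<string>,
): Promise<SamplingOutcome> {
  if (caps.sampling === undefined) {
    // No server-side LLM fallback: report the gap and fabricate nothing.
    return { status: "unsupported_by_host" };
  }
  return { status: "ok", summary: await requestSummary() };
}
```

The unsupported branch returns before `requestSummary` is ever invoked, so no summary text can be produced on the server side.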

Issues found & fixed

  1. Stale comment in pickOpenAssertions claimed "only the latest per assertion name" is kept, but the implementation (and its test) surface all failed oc_assert entries — corrected the comment (55d484e1).
  2. Sampling content parsing only handled the scalar {type,text} shape. The MCP spec allows content to be an array of blocks; an array response would have silently produced an empty summary. Now parses both forms and takes the first text block (b14793bf).
  3. session_id semantics were under-documented — the recent_steps window is taken across all sessions before the id filter, so a busy journal can yield fewer than recent_steps for one session. Clarified in the input-schema description (b14793bf).
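
The ordering in item 3 is easiest to see in a small sketch. The entry shape and the `windowThenFilter` name are assumptions for illustration; the point is only that the window is cut before the session filter runs.

```typescript
// Hypothetical entry shape for illustration.
interface SessionJournalEntry {
  session_id: string;
  summary: string;
}

// Take the last recentSteps entries across ALL sessions first,
// then keep only the entries that match sessionId (when one is given).
function windowThenFilter(
  entries: SessionJournalEntry[],
  recentSteps: number,
  sessionId?: string,
): SessionJournalEntry[] {
  // slice(-0) would return the whole array, so guard non-positive windows.
  const recent = recentSteps > 0 ? entries.slice(-recentSteps) : [];
  return sessionId === undefined
    ? recent
    : recent.filter((e) => e.session_id === sessionId);
}
```

For a journal with sessions `a, b, a, b, b` and a window of 3, filtering on session `a` yields a single entry even though `a` has two entries overall, because the older one falls outside the cross-session window.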

Verdict

Specialized code review: MERGE-READY — 0 P0 / 0 P1. Local build clean; tests/tools/oc-journal-compact.test.ts 8/8 pass. Integration round-trip + cadence docs are correctly deferred to Part 2 of #1434.

shaun0927 and others added 2 commits May 29, 2026 01:48
…tability (#1444)

* docs(skill): document oc_journal_compact cadence and pin round-trip stability

Add docs/skills/trajectory-compaction.md describing recommended
compaction cadence (every 20–50 steps), strategy selection
(recent_k / checkpoint_only / sampling), and composition with the
LLM-free fast path (#1430 Part 2).

Add round-trip integration test that pins the lossy-but-stable
contract: two consecutive compactions over the same trajectory must
agree on open_assertions, last_checkpoint, and (for deterministic
strategies) the summary text. Without this gate, a future refactor of
the truncation logic could silently destabilise resume flows.
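
The lossy-but-stable contract can be expressed as a small check. This is a sketch, not the integration test from this commit: `CompactionResultSketch` and `isStableRoundTrip` are hypothetical names, and representing `open_assertions` as a string array is an assumption.

```typescript
// Hypothetical result shape; field types are assumptions.
interface CompactionResultSketch {
  summary: string;
  open_assertions: string[];
  last_checkpoint?: string;
}

// Run the same compaction twice over the same entries and check that the
// parts a resume flow depends on agree. The summary text is only compared
// for deterministic strategies.
function isStableRoundTrip<T>(
  compact: (entries: T[]) => CompactionResultSketch,
  entries: T[],
  deterministic: boolean,
): boolean {
  const first = compact(entries);
  const second = compact(entries);
  const sameAssertions =
    JSON.stringify(first.open_assertions) === JSON.stringify(second.open_assertions);
  const sameCheckpoint = first.last_checkpoint === second.last_checkpoint;
  const sameSummary = !deterministic || first.summary === second.summary;
  return sameAssertions && sameCheckpoint && sameSummary;
}
```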

Part 2 of #1434.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(journal-compact): pin sampling-unsupported fallback; soften cadence doc

Adds a third round-trip case asserting that strategy:'sampling' returns
status:'unsupported_by_host' with no fabricated summary when the host lacks
the sampling capability — the exact #1359 guarantee (never server-side) that
was previously only asserted in prose. Also softens the cadence table to keep
the 20/50-step figures attributed to Webwright rather than implying an
OpenChrome-measured benchmark (#1359 P5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	docs/mcp/tool-annotations.md
#	src/tools/index.ts
@shaun0927 shaun0927 merged commit 0373c50 into develop May 28, 2026
9 checks passed