Productization Events 121-124: PTSP + Signed Surface backbone + practice visibility layer + 'a way to think' framing #76

Merged
junjslee merged 41 commits into master from productization-events-121-124-way-to-think
May 12, 2026

@junjslee
Owner

Summary

This PR consolidates four Events (121-124) of the productization cycle that opened after Events 119-120 closed the per-task A/B depth-measurement path. All work is additive only; the soak-protected core/hooks/reasoning_surface_guard.py hot path and reasoning-surface@1 schema are untouched.

Primary identity reframe (Event 123): episteme is a way to think — 생각의 틀 — operationalized at the file system level. The five-stage cognitive practice (Frame → Decompose → Execute → Verify → Handoff) authored in core/memory/global/cognitive_profile.md + workflow_policy.md is the product. The signed Reasoning Surface, typed PTSP ledgers, pre-tool-use gate, standalone verifier, and Regulator Evidence Packet are scaffolding for the practice and residue from it. See docs/THE_WAY_TO_THINK.md.

Empirical anchor: the MIRROR benchmark (arXiv 2604.19809) — "providing models with their own calibration scores produces no significant improvement; only architectural constraint is effective." External constraint reduces LLM Confident Failure Rate from 0.60 to 0.14 across 5 frontier models. episteme is that external constraint at the operator decision layer.

What lands by Event

Event 121 — Phase 3 backbone (6b90618)

  • core/ptsp/ — Provenance-Tagged Step Pipeline (typed Fact/Inference/Unknown/Assumption ledgers + Promotion Gate, Invariants I1-I5) counters arXiv 2509.09677 self-conditioning
  • core/signing/ — Ed25519 signing + JCS canonicalization + RFC 3161 TSA shape + Sigstore Rekor inclusion-proof shape + canonical-surface envelope; zero-dep with HMAC-SHA256 fallback structurally distinguishable from production Ed25519
  • src/episteme/verify/ — standalone auditor verifier CLI, deterministic exit codes 0/10/11/12/13/14/20/21/30/64
  • docs/PRODUCTIZATION_PLAN.md + docs/COMPLIANCE_CROSSWALK.md
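
A minimal sketch of the test-mode signing shape described for core/signing/, under stated assumptions. The envelope field names (alg, payload, sig) and the tag value are hypothetical and do not come from this PR. Sorted-key compact JSON stands in for JCS here; RFC 8785 adds number and string serialization rules that this sketch does not implement. Only the HMAC-SHA256 fallback is shown, tagged so that it cannot be mistaken for a production Ed25519 signature.

```python
import hashlib
import hmac
import json

# Hypothetical tag value; the PR only says the fallback is structurally distinguishable.
TEST_ALG = "hmac-sha256-test"


def canonicalize(body: dict) -> bytes:
    """Deterministic JSON bytes: sorted keys, no insignificant whitespace.

    This approximates JCS for simple payloads only.
    """
    text = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sign_test_mode(body: dict, key: bytes) -> dict:
    """Wrap a surface body in an envelope signed with the HMAC fallback."""
    digest = hmac.new(key, canonicalize(body), hashlib.sha256).hexdigest()
    return {"alg": TEST_ALG, "payload": body, "sig": digest}


def verify_test_mode(envelope: dict, key: bytes) -> bool:
    """Return True only for an untampered test-mode envelope."""
    if envelope.get("alg") != TEST_ALG:
        return False  # not test-mode; a real verifier would route to Ed25519 here
    expected = hmac.new(key, canonicalize(envelope["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope.get("sig", ""))
```

Because the signature covers the canonical bytes rather than the original text, reordering keys in the payload does not invalidate it, while any change to a value does.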

Event 122 — Practical realization + framing audit (b47dbfc)

  • src/episteme/surface/ — operator UX CLI (episteme surface author / sign / show / list / status / verify)
  • src/episteme/evidence/ — terminal-first viewer + Regulator Evidence Packet ZIP exporter (episteme evidence posture / register / show / alerts / packet build)
  • src/episteme/hooks/signed_surface_validator.py — opt-in PreToolUse hook running additively alongside reasoning_surface_guard.py
  • src/episteme/adapters/hermes.py — Hermes substrate bridge for signed-surface@1.0
  • pyproject [signing] PyNaCl optional dependency
  • CLI top-level wiring (episteme surface | evidence | verify)
  • Mid-Event audit corrected 5 category-error classes (audience misallocation, positioning treated as validated, substrate-neutrality erosion, commercialization assumption, hedging dishonesty)
  • 33 new tests

Event 123 — The way to think framing (1477fb4)

  • docs/THE_WAY_TO_THINK.md — primary identity doc (~2400 words). Every claim traces to operator-authored cognitive_profile.md / workflow_policy.md, the foundational mental models (Kahneman / Dalio / Boyd / Munger), or external research (MIRROR + long-horizon papers)
  • README.md + README.ko.md hero rewritten to lead with 생각의 틀
  • PRODUCTIZATION_PLAN § 0 + COMPLIANCE_CROSSWALK preamble + MARKETING_COPY_DRAFT three positionings rewritten to derive from THE_WAY_TO_THINK.md

Event 124 — Practice visibility layer (772a5ce)

  • core/practice/cognitive_moves.py — source-of-truth registry: 5 stages × N cognitive moves, each with named System-1 failure counter + schema-field mapping + doc anchor
  • core/practice/quality.py — observe_surface() / observe_surfaces() returning gap observations (not a single-score grade — anti-gaming discipline)
  • src/episteme/_ui.py — zero-dep ANSI primitives (boxes, health indicators, sparklines, kv-table) with TTY + NO_COLOR + EPISTEME_NO_RICH detection
  • src/episteme/practice/ — episteme practice walk | retro | demo subcommand group
  • Hook error JSON now carries cognitive_move metadata (id + name + stage + counters + doc_anchor)
  • episteme surface author --interactive prompts now render with cognitive-move-name preamble + System-1 counter; brief practice-quality preview after authoring
  • episteme evidence posture panel upgraded with health indicators (green / yellow / red on signed % / chain breaks / test-mode signatures)
  • 64 new tests
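
The registry and hook-error items above can be pictured with a short sketch. The five field names follow the list in this PR (id, name, stage, counters, doc anchor); everything else is an assumption, including the class name, the example entry, the "error" key, and the doc anchor string. The real registry contents are not shown in the PR.

```python
import json
from dataclasses import asdict, dataclass

STAGES = ("Frame", "Decompose", "Execute", "Verify", "Handoff")


@dataclass(frozen=True)
class CognitiveMove:
    move_id: str
    name: str
    stage: str
    counters: str    # the System-1 failure this move counters
    doc_anchor: str


# Hypothetical example entry, for illustration only.
EXAMPLE_MOVE = CognitiveMove(
    move_id="frame.example",
    name="Example move",
    stage="Frame",
    counters="example System-1 failure",
    doc_anchor="docs/THE_WAY_TO_THINK.md#example",
)


def hook_error(message: str, move: CognitiveMove) -> str:
    """Serialize a block message with the cognitive_move metadata attached."""
    if move.stage not in STAGES:
        raise ValueError(f"unknown stage: {move.stage}")
    return json.dumps({"error": message, "cognitive_move": asdict(move)}, sort_keys=True)
```

Keeping the registry as the single source for this metadata means the hook, the practice CLI, and the quality observer can all name the same move without duplicating strings.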

Tests

pytest -q → 1050 passed, 54 subtests, zero regressions across all four Events.

Discipline preserved

  • Kernel dependencies = [] zero-dep posture preserved (PyNaCl is [signing] extra; _ui.py is stdlib-only)
  • Soak-protected core/hooks/reasoning_surface_guard.py + kernel/ docs untouched
  • No marketing copy cites the 70.3% CFR-reduction number bare — all citations point at arXiv 2604.19809 directly with the 0.60 → 0.14 number attributed to the paper, not to us
  • Practice-quality scoring exposed only via episteme practice retro (anti-gaming discipline)
  • Operator-controlled signing key; signing key is structurally out of the agent's reach
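
To illustrate why gap observations resist gaming better than a single score, here is a rough sketch. It reuses the function names observe_surface() and observe_surfaces() from this PR, but the signatures, return types, gap codes, and the surface field names checked (unknowns, disconfirmation) are all assumptions made for the example. The severity levels are the four named in the commit message.

```python
from dataclasses import dataclass

SEVERITIES = ("critical", "warn", "advisory", "info")


@dataclass(frozen=True)
class Observation:
    severity: str
    code: str       # hypothetical gap code
    message: str


def observe_surface(body: dict) -> list[Observation]:
    """Return gap observations for one surface body; never a numeric score.

    The field names checked here are hypothetical placeholders.
    """
    found: list[Observation] = []
    if not body.get("unknowns"):
        found.append(Observation("warn", "no-unknowns", "no unknowns were recorded"))
    if not body.get("disconfirmation"):
        found.append(
            Observation("critical", "no-disconfirmation", "no disconfirming condition was stated")
        )
    return found


def observe_surfaces(bodies: list[dict]) -> dict[str, int]:
    """Aggregate observation counts by gap code across several surfaces."""
    counts: dict[str, int] = {}
    for body in bodies:
        for obs in observe_surface(body):
            counts[obs.code] = counts.get(obs.code, 0) + 1
    return counts
```

The aggregate is a count per named gap, so there is no single number to optimize; each entry points back at a specific missing piece of the practice.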

What this PR does NOT include

  • README.es.md + README.zh.md i18n parity translations (mechanical follow-up after EN + KO confirmed)
  • kernel/ docs cross-references to THE_WAY_TO_THINK.md (soak-protected; separate Event)
  • Web landing (epistemekernel.com) update (operator-gated production deploy)
  • Live Sigstore Rekor integration (deferred to Event 125 candidate)
  • OSF pre-registration submission (draft at docs/OSF_PRE_REGISTRATION_DRAFT.md; gated on Phase 2 recruitment)
  • Probe 1 / Probe 2 / Probe 3 outreach delivery (operator-gated)
  • Promotion of signed-surface@1.0 to default kernel path (gated on <100ms hot-path timing)

Try it (60 seconds)

episteme practice walk          # narrated 5-stage walkthrough
episteme practice demo          # worked-example surface body, narrated
episteme surface author -i      # author a real signed surface
episteme evidence posture       # view your audit trail

Test plan

  • All new modules pass pytest -q with zero regressions on the existing 986-test baseline (now 1050)
  • episteme practice walk names all 5 stages + cites operator-authored source docs
  • episteme practice demo --format json produces a body that validates against validate_surface_body() (i.e., it's surface sign-able)
  • Hook block JSON carries cognitive-move metadata; exit code remains 2 (Claude Code block contract)
  • EPISTEME_NO_RICH=1 falls back to plain ASCII; NO_COLOR respected per https://no-color.org
  • Standalone episteme verify round-trip across signed surfaces, mutations detected
  • Hermes signed-surface bridge writes ~/.hermes/SIGNED_SURFACE_PROTOCOL.md + schema reference

junjslee added 10 commits May 8, 2026 21:14
…fier CLI

Event 121 pivots positioning from "cognitive tool" to Compliance Evidence
Layer. The reframe survives the Event 119-120 saturation finding: model-output
depth lift is structurally hard to demonstrate at frontier model strength;
operator Calibration-Lift on irreversible decisions (MIRROR-aligned CFR
reduction) is the new load-bearing value claim. EU AI Act Article 12 high-risk
obligations apply 2026-08-02 — 84-day regulatory tailwind window.

What lands here (load-bearing slice; strictly additive, parallel-track):

- core/ptsp/  Provenance-Tagged Step Pipeline countering the self-conditioning
  effect (arXiv 2509.09677). Typed Fact/Inference/Unknown/Assumption ledgers,
  Promotion Gate (Invariants I1-I5), JCS canonicalization, typed-tag context
  injection.

- core/signing/  Cryptographic signing primitives. Zero-hard-dependency
  Ed25519 compat layer (PyNaCl when available, structurally-tagged HMAC-SHA256
  fallback for tests), Signed Reasoning Surface envelope schema with sign +
  verify, RFC 3161 TSA token shape, Sigstore Rekor inclusion-proof shape.
  Live TSA/Rekor integration deferred behind operator choice.

- src/episteme/verify/  Standalone auditor CLI with deterministic exit codes
  (0/10/11/12/13/14/20/21/30/64). Single/batch/chain modes. No runtime
  dependency on episteme — auditors can ship just this module + Python stdlib.

- docs/PRODUCTIZATION_PLAN.md  Phase 3-5 master plan.
- docs/COMPLIANCE_CROSSWALK.md  Field-by-field regulatory mapping:
  EU AI Act (Art. 12/13/14/19/72), NIST AI RMF + GenAI Profile (NIST AI
  600-1), Financial-services framework set (SR 11-7, OCC, EBA, MAS, OSFI,
  FINRA, SEC 17a-4(f)).
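
As an illustration of the typed-ledger idea behind core/ptsp/, the sketch below tags each entry as Fact, Inference, Unknown, or Assumption and guards promotion to Fact. The gate rule used here (evidence required, Unknowns not promotable) is a hypothetical stand-in; it does not reproduce Invariants I1-I5, which this PR names but does not spell out. All class and field names are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tag(Enum):
    FACT = "fact"
    INFERENCE = "inference"
    UNKNOWN = "unknown"
    ASSUMPTION = "assumption"


@dataclass(frozen=True)
class Entry:
    tag: Tag
    text: str
    evidence: tuple[str, ...] = ()  # hypothetical: references backing the claim


@dataclass
class Ledger:
    entries: list[Entry] = field(default_factory=list)

    def add(self, entry: Entry) -> None:
        self.entries.append(entry)

    def promote(self, index: int, evidence: tuple[str, ...]) -> Entry:
        """Promote an Inference or Assumption to a Fact.

        Hypothetical gate rule: promotion requires at least one evidence
        reference, and Unknowns cannot be promoted directly.
        """
        current = self.entries[index]
        if current.tag not in (Tag.INFERENCE, Tag.ASSUMPTION):
            raise ValueError(f"cannot promote a {current.tag.value} entry")
        if not evidence:
            raise ValueError("promotion requires evidence")
        promoted = Entry(Tag.FACT, current.text, evidence)
        self.entries[index] = promoted
        return promoted
```

The point of the tag is that a later step cannot silently treat an earlier guess as established; it has to pass through the gate first.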

Discipline:
- Soak-protected kernel surfaces untouched. reasoning_surface_guard.py and
  reasoning-surface@1 schema continue to govern the live kernel; the new
  signed-surface@1.0 schema is parallel and opt-in.
- Loss-averse asymmetry posture: local commit only, no push, no PR, no
  publish, no Probe 1 outreach, no commercial entity action.
- Marketing copy that cites the 70.3% CFR-reduction number must hedge it as
  a MIRROR-aligned design target with OSF pre-registration link until the
  productive-run dataset exists.

Tests: pytest -q → 953 passed, 54 subtests passed, zero regressions.
New: 34 tests across tests/test_ptsp_promotion.py (15),
tests/test_signing_canonical_surface.py (9), tests/test_verify_cli.py (10).

Deferred (each with its own future Event + surface):
- Live Sigstore Rekor integration
- CCO dashboard MVP front-end (operator-gated on Probe 1 outcome)
- Phase 2 productive run dataset
- Probe 1 EU AI Governance outreach delivery
- Promotion of signed-surface@1.0 to default-path

Mid-Event 122, the operator flagged two corrections that reshaped the build:

  1. LangSmith/Langfuse adapter was the wrong category (observability ≠
     substrate). Pivoted Task #15 to Hermes signed-surface bridge — peer
     substrate to Claude Code per pyproject keywords.

  2. Broader doc reconsideration. Audit returned 5 category-error classes
     in PRODUCTIZATION_PLAN.md; all 5 corrected in this Event:
       - audience misallocation (CCO weighted higher than operator)
       - positioning treated as validated (Compliance Evidence Layer was
         a hypothesis, not a measurement)
       - substrate-neutrality erosion (Skills Marketplace = Claude-only;
         LangSmith = observability)
       - commercialization assumption (Day-90 forced commercial-or-die)
       - hedging dishonesty (70.3% with OSF-link that doesn't exist)

Empirical anchor verified mid-Event: MIRROR benchmark (arXiv 2604.19809)
finding — "providing models with their own calibration scores produces
no significant improvement; only architectural constraint is effective."
External constraint reduces LLM CFR from 0.60 to 0.14 across 5 frontier
models. This is the load-bearing rationale for episteme's structural
mechanism over procedural prompting.

Framing rewrite (docs):
  - docs/PRODUCTIZATION_PLAN.md  full rewrite: rationale chain (Korean
    essay → MIRROR → architectural constraint → each artifact's role);
    three positioning hypotheses (Compliance / Operator Audit Trail /
    Pre-Action Reasoning Commitment) under structured probes, none
    declared validated; four-outcome Day-90 matrix (added operator-first
    OSS sustain); layer diagram replacing differentiation matrix; honest
    substrate-coverage table (Claude full / Hermes partial / Codex name-
    only / Cursor name-only / opencode name-only)
  - docs/COMPLIANCE_CROSSWALK.md  reframed preamble as downstream
    structural mapping, not primary positioning

Practical realization layer (additive build):
  - src/episteme/surface/    operator UX — author / sign / show / list /
    status / verify (already committed in chkpt 92252fb)
  - src/episteme/evidence/   auditor viewer + Regulator Evidence Packet
    exporter — posture / register / show / alerts / packet build
    (already committed in chkpt 92252fb)
  - src/episteme/hooks/signed_surface_validator.py  opt-in PreToolUse
    hook running additively alongside reasoning_surface_guard.py
    (already committed in chkpt 92252fb)
  - src/episteme/adapters/hermes.py  extended with signed-surface bridge
    — SIGNED_SURFACE_PROTOCOL.md + schema reference + governance
    addendum to ~/.hermes/ for substrate parity
  - src/episteme/cli.py  narrow Edit at entry point — top-level
    surface/evidence/verify subcommands pre-dispatched before main
    argparse so submodule flags pass through cleanly
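
The pre-dispatch pattern described for src/episteme/cli.py can be sketched as follows. This is an assumption-laden illustration, not the actual entry point: the handler function, the dispatch table name, and the --version flag are all hypothetical. The idea is that known subcommands are routed before the main parser runs, so flags the main parser does not know about reach the submodule untouched.

```python
import argparse
import sys
from typing import Callable, Optional, Sequence


# Hypothetical stand-in for a submodule entry point.
def _surface_main(argv: Sequence[str]) -> int:
    print("surface", list(argv))
    return 0


PRE_DISPATCH: dict[str, Callable[[Sequence[str]], int]] = {
    "surface": _surface_main,
}


def main(argv: Optional[Sequence[str]] = None) -> int:
    args = list(sys.argv[1:] if argv is None else argv)
    # Route known subcommands before argparse sees their flags.
    if args and args[0] in PRE_DISPATCH:
        return PRE_DISPATCH[args[0]](args[1:])
    parser = argparse.ArgumentParser(prog="episteme")
    parser.add_argument("--version", action="store_true")
    parser.parse_args(args)
    return 0
```

Without the early return, argparse would exit with a usage error on any submodule flag it does not recognize.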

Auxiliary docs:
  - docs/HOW_TO_AUTHOR_SIGNED_SURFACE.md   developer-facing walkthrough
  - docs/HOW_TO_VERIFY_EVIDENCE_PACKET.md  auditor-facing walkthrough
  - docs/LIVE_REKOR_DECISION.md            Sigstore public vs self-hosted
                                            vs hybrid vs none — operator
                                            decision matrix
  - docs/OSF_PRE_REGISTRATION_DRAFT.md     Phase 2 trial pre-reg draft
                                            ready for OSF submission
  - docs/MARKETING_COPY_DRAFT.md           three positioning candidate
                                            copies (A/B/C) — none landed
                                            in README

Tests: pytest -q → 986 passed, 54 subtests, zero regressions.
New (33 tests, 4 files):
  - tests/test_surface_cli.py            11
  - tests/test_evidence_cli.py            7
  - tests/test_signed_surface_validator_hook.py  9
  - tests/test_e2e_evidence_pipeline.py   6 (incl. Hermes-bridge artifact)

Dropped in Event 122 (per framing audit):
  - Skills Marketplace bundle task — Claude-only distribution; premature
  - LangSmith / Langfuse adapters — observability not substrate;
    deferred until Probe 1 signal arrives

Discipline:
  - Kernel zero-dep posture preserved (PyNaCl optional;
    test-mode HMAC structurally tagged)
  - Soak-protected core/hooks/reasoning_surface_guard.py untouched
  - Local commit only; no push, no PR, no publish, no Probe outreach
    delivery, no OSF submission, no commercial entity action
  - No marketing copy cites the 70.3% number bare; all citations point
    at arXiv 2604.19809 directly

The deeper framing correction. Event 122 fixed 5 category-error classes but
still described episteme as a forcing function / artifact / compliance evidence
layer. The operator's deeper correction:

  "nah bro... it should be [...] the way to think [...] put this somewhere."
  "[Expletive] fix all of it. Think through what we need to do and how best
  to frame this, then implement it."

The product is a way to think — 생각의 틀 — the five-stage cognitive practice
(Frame → Decompose → Execute → Verify → Handoff) authored in
core/memory/global/cognitive_profile.md + workflow_policy.md. The signed
Reasoning Surface, the typed PTSP ledgers, the pre-tool-use gate, the
standalone verifier, the Regulator Evidence Packet are scaffolding for the
practice and residue from it. Without the practice they are theater. With the
practice they are how the practice survives at frontier model strength.

Landed (additive only, no code changes):

  - docs/THE_WAY_TO_THINK.md  NEW primary identity doc (~2400 words).
    Operationalized index of the practice. Every claim traces to
    operator-authored cognitive_profile.md / workflow_policy.md, the
    foundational mental models (Kahneman / Dalio / Boyd / Munger), or
    external research (MIRROR arXiv 2604.19809; long-horizon arXiv 2509.09677).
    Names which artifact implements which cognitive move.

  - README.md hero rewrite + section header rewrite. Lead with 생각의 틀 /
    "a way to think." Deep content (ABCD blueprints, protocol synthesis,
    install) preserved unchanged.

  - README.ko.md parity translation of the new hero + section header.
    생각의 틀 kept as operator-coined load-bearing phrase.

  - docs/PRODUCTIZATION_PLAN.md § 0 rewrite. New § 0.1 ("the thing itself")
    points at THE_WAY_TO_THINK.md as primary identity. Compliance / packet /
    signed surface demoted to consequences of the practice. § 0.5 expanded
    with "not a prompt template" + "not an AI safety system" rows.

  - docs/COMPLIANCE_CROSSWALK.md preamble. Explicitly framed as residue of
    the practice. Structural fit with AI Act / NIST / FS-framework
    obligations is consequence-of-being-right, not goal.

  - docs/MARKETING_COPY_DRAFT.md preamble + three positioning headlines.
    Three positionings reframed as three audience-facing surfaces of ONE
    practice, not three separate identities. Each headline leads with the
    practice.

  - docs/PLAN.md / docs/PROGRESS.md / docs/NEXT_STEPS.md synced with
    Event 123 entries.

Deferred to follow-up Events (operator-gated):

  - README.es.md + README.zh.md i18n parity (mechanical translation; batch
    under operator review after EN + KO confirmed)
  - kernel/ DESIGN_V1_0_*.md and adjacent docs cross-reference to
    THE_WAY_TO_THINK.md
  - web/ + epistemekernel.com landing-page rewrite (irreversible production
    deploy; operator-gated)

Tests: pytest -q → 986 passed, 54 subtests, zero regressions. No code
changes; the 986-test suite is correct as enforcement geometry for the
practice and continues to pass without intervention.

Discipline:
  - Code untouched. Kernel-protected surfaces untouched.
  - Local commit only; no push.
  - No AI co-author trailer.
  - Every claim in THE_WAY_TO_THINK.md traces to a named source.

The operator authorized overnight autonomous continuation with three asks:
(1) verify code reflects the Event 123 "way to think" framing, (2) design
UX "high-grade unique and useful (visually, making it easier to understand
and use)," (3) realize the product as a real thing that can be created.
Subsequently lifted the loss-averse irreversible gate conditionally: "it
can be irreversible if that is the right direction for us."

What landed (additive only; soak-protected kernel untouched):

  - core/practice/cognitive_moves.py  source-of-truth registry of 5 stages
    × N cognitive moves with name + description + System-1 failure
    counter + schema-field mapping + doc anchor. Referenced by hook
    errors + practice CLI + quality observer.

  - core/practice/quality.py  observe_surface() / observe_surfaces() —
    gap observations against cognitive-move discipline. Severity:
    critical / warn / advisory / info. NOT a single-score grade
    (anti-gaming discipline; would induce optimizing for the score
    rather than the practice).

  - src/episteme/_ui.py  zero-dep ANSI primitives — boxes, colored
    headers, health indicators (● green/yellow/red, ASCII fallback
    [+]/[~]/[!]), sparklines (Unicode-block ASCII fallback), progress,
    kv-table. TTY + NO_COLOR + EPISTEME_NO_RICH detection. Stdlib only;
    kernel zero-dep posture preserved.

  - src/episteme/practice/  episteme practice walk | retro | demo
    subcommand group. walk = narrated 5-stage walkthrough with each
    cognitive move + System-1 counter. demo = worked-example surface
    body (narrated or JSON-only; JSON output validates against the
    surface builder so it's surface-sign-able). retro = practice
    retrospective with gap observations over time window.

  - src/episteme/cli.py  episteme practice registered at top-level
    (pre-argparse-dispatch pattern matching surface/evidence/verify).

  - src/episteme/hooks/signed_surface_validator.py  hook error JSON
    now includes a cognitive_move block (move_id + name + stage +
    counters + doc_anchor). Exit codes unchanged; the model + operator
    can now read failures as named cognitive-move violations rather
    than schema-field violations.

  - src/episteme/surface/_cli.py (interactive path only)  each prompt
    rendered with cognitive-move name + System-1 failure-counter
    preamble. Brief practice-quality preview after authoring shows
    which gaps episteme practice retro will surface. Non-interactive
    flags structurally untouched.

  - src/episteme/evidence/_viewer.py  upgraded posture panel: boxed
    sections with health indicators (signed % / chain breaks /
    test-mode-sig count colored green/yellow/red by threshold). JSON
    output unchanged for scripting.
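
A minimal sketch of the detection and fallback behavior described for _ui.py and the posture panel, using only the standard library. The function names and thresholds are hypothetical. The handling of EPISTEME_NO_RICH (any non-empty value forces plain output) is an assumption; the NO_COLOR handling follows the no-color.org convention, and the ASCII markers [+]/[~]/[!] are the ones listed above.

```python
import os
import sys
from typing import Mapping, Optional, TextIO


def use_color(env: Optional[Mapping[str, str]] = None, stream: Optional[TextIO] = None) -> bool:
    """Decide whether ANSI color should be emitted."""
    env = os.environ if env is None else env
    stream = sys.stdout if stream is None else stream
    if env.get("NO_COLOR"):          # no-color.org: any non-empty value disables color
        return False
    if env.get("EPISTEME_NO_RICH"):  # assumed: any non-empty value forces plain output
        return False
    return bool(getattr(stream, "isatty", lambda: False)())


def health(value: float, warn: float, crit: float, color: bool) -> str:
    """Indicator for a 'higher is worse' metric such as chain breaks."""
    if value >= crit:
        level = "red"
    elif value >= warn:
        level = "yellow"
    else:
        level = "green"
    if not color:
        return {"green": "[+]", "yellow": "[~]", "red": "[!]"}[level]
    code = {"green": "32", "yellow": "33", "red": "31"}[level]
    return f"\x1b[{code}m●\x1b[0m"
```

A 'higher is better' metric such as signed percentage would need the comparison inverted, which is presumably what the "inverse" health indicator in the test list covers.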

Tests: pytest -q → 1050 passed, 54 subtests, zero regressions.
(Was 986 baseline + 64 new across 5 files.)

  - tests/test_practice_cognitive_moves.py  14 tests: registry
    consistency, helper fns, doc anchors, every move has a named
    System-1 counter
  - tests/test_practice_quality.py  11 tests: observation severity
    levels, retrospective aggregation, scenario-specific gap codes
  - tests/test_ui_rendering.py  24 tests: env-var detection, color
    forcing, health indicators (regular + inverse), boxes, headers,
    sparklines, progress, kv-table, Renderer dataclass
  - tests/test_practice_cli.py  11 tests: walk names all 5 stages +
    foundational models + source docs; demo narrated/JSON modes;
    demo JSON is validate_surface_body-valid; retro empty + populated;
    top-level CLI dispatch
  - tests/test_hook_cognitive_move_messages.py  4 tests: hook errors
    carry correct cognitive_move metadata; hook still returns exit 2
    (Claude Code block contract)

Discipline:
  - Kernel zero-dep posture preserved
  - Soak-protected core/hooks/reasoning_surface_guard.py + kernel/
    docs untouched
  - Practice quality scoring exposed ONLY via practice retro
    (anti-gaming discipline; not numeric score)
  - No AI co-author trailer

Two docs contained operationally sensitive material that shouldn't live
in the public repo:

  - docs/MARKETING_COPY_DRAFT.md  positioning hypotheses currently
    under Probe testing + literal cold-outreach script + probe
    deployment table mapping copy → channel → audience. Public exposure
    would let target audiences see the framing experiment being run.

  - docs/PRODUCTIZATION_PLAN.md  mixed: § 0 rationale + Phase 3-4
    technical scope (public-appropriate) BUT Phase 5 GTM probes
    (Probe 1 cold-outreach contact targeting + literal email template
    + Day-90 commercial-spin-off matrix + anti-self-deception
    protocol — sensitive). Moved whole-file private; operator can pull
    a public-implementation-summary back later if desired.

Both moved to ~/episteme-private/docs/ matching the existing
fully-private pattern of cp-*.md / POSTURE.md / NARRATIVE.md /
DECISION_STORY.md / etc. (No symlink stubs — those are reserved for
PLAN.md / PROGRESS.md / NEXT_STEPS.md which need public placeholders
for the hook chain to find authoritative docs.)

Cross-references updated:
  - docs/THE_WAY_TO_THINK.md  (2 refs) — replaced direct link with
    honest "GTM strategy held in operator's private notes" phrasing
  - docs/HOW_TO_VERIFY_EVIDENCE_PACKET.md  (2 refs) — replaced with
    inline descriptions that don't depend on the private docs
  - docs/OSF_PRE_REGISTRATION_DRAFT.md  (1 ref) — submission checklist
    item updated to point at private notes

No code changes; pytest suite unaffected.
@junjslee junjslee merged commit 9829d3f into master May 12, 2026
5 checks passed
@junjslee junjslee deleted the productization-events-121-124-way-to-think branch May 12, 2026 17:21